Big Data & Machine Learning Cloud OnBoard

Short, meaningful column names reduce storage and RPC overhead
Design the row key with the most common query in mind
Column families are a quick way to get some hierarchy

Row Key               MD:SYMBOL  MD:LASTSALE  MD:LASTSIZE  MD:TRADETIME   MD:EXCHANGE
NASDAQ#1426535612045  ZXZZT      600.58       300          1426535612045  NASDAQ

Design row key to minimize hotspots
Use short column names
Designed for sparse tables

Can work with Bigtable using the HBase API

    import org.apache.hadoop.hbase.*;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.*;

    byte[] CF = Bytes.toBytes("MD"); // column family
    Connection connection = ConnectionFactory.createConnection(...);
    Table table = null;
    try {
        table = connection.getTable(TABLE_NAME);
        // Row key: exchange, symbol, and trade time joined with '#'
        Put p = new Put(Bytes.toBytes("NASDAQ#GOOG#1234561234561"));
        // Each cell is addressed by column family and column qualifier
        p.addColumn(CF, Bytes.toBytes("SYMBOL"), Bytes.toBytes("GOOG"));
        p.addColumn(CF, Bytes.toBytes("LASTSALE"), Bytes.toBytes(742.03d));
        ...
        table.put(p);
    } finally {
        if (table != null) table.close();
    }
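
The row-key advice above can be tried out without a Bigtable cluster. The following is a minimal sketch, not the HBase or Bigtable API: `RowKeySketch`, `rowKey`, and `prefixScan` are hypothetical names, a `TreeMap` stands in for the table's sorted row keys, and the second GOOG timestamp and price are made-up illustration values. It assumes the most common query is "all trades for one symbol on one exchange", so the key leads with exchange and symbol rather than with the timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of any client library.
class RowKeySketch {
    // Builds EXCHANGE#SYMBOL#TIMESTAMP. Leading with exchange and symbol
    // keeps one symbol's trades adjacent in sort order and avoids a
    // timestamp-first key, which would send every new write to the same
    // end of the key space (a hotspot).
    static String rowKey(String exchange, String symbol, long tradeTimeMillis) {
        return exchange + "#" + symbol + "#" + tradeTimeMillis;
    }

    // Returns the keys that start with the given prefix, in sorted order.
    // The sorted map is a stand-in for rows stored in row-key order.
    static List<String> prefixScan(SortedMap<String, Double> rows, String prefix) {
        // '\uffff' sorts after every character used in these keys, so the
        // half-open range [prefix, prefix + '\uffff') covers the prefix.
        return new ArrayList<>(rows.subMap(prefix, prefix + '\uffff').keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, Double> rows = new TreeMap<>();
        rows.put(rowKey("NASDAQ", "GOOG", 1234561234561L), 742.03);
        rows.put(rowKey("NASDAQ", "GOOG", 1234561234999L), 742.10); // made-up row
        rows.put(rowKey("NASDAQ", "ZXZZT", 1426535612045L), 600.58);
        // Only the two GOOG rows fall inside the prefix range.
        System.out.println(prefixScan(rows, "NASDAQ#GOOG#"));
    }
}
```

Because the timestamps here all have the same number of digits, string order matches time order within a symbol; timestamps of varying width would need zero-padding to keep that property.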