Movatterモバイル変換

Who Am I?• Jonathan• MySQL Consultant• Working with MySQL since 2007• Specialize in SQL, Indexing and Reporting (Big Data)

Who is this for?* Smilies indicate ability to control area

What Will I Cover? Domain Knowledge This Much0% 20% 100%

Solutions this tutorial will coverOccurrences Problems

serenesimplycomplicated.blogspot.co.uk/2012/07/developing-direction.html

What Will I Cover?• The top 20% of the strategies to resolve 80% of your performance problems• The strategies that are within reach of developers• Strategies that are more common and more established • From my experience. • To reduce risk

Table of Contents Part One Part Three• Indexes • Read Cache• Finding • Scaling Reads Bottlenecks • Reporting Part Two • Write Buffers• Partitioning • Scaling Writes• Intensive Table • Sharding Optimization

Indexes• Advantages • Speed – Use the right path• Now used in NoSQL stores• “A properly indexed database will give you very few problems” – me• My blog – “Indexing and Caching”

B-Tree Indexes(http://20bits.com/article/interview-questions-database-indexes)

(http://www.youtube.com/watch?v=coRJrcIYbF4)

Choosing the best IndexPrevent Readingfrom Disk *3 *2 Prevent Extra *1 ProcessingPrevent Table Scans

Indexes• 1 Star – EXPLAIN • Type • const - where id=1 • ref - where location='london' • eq_ref - where t1.id = t2.id • Extra • using where • Limitation – type • range - where id in (1,2,3,4,5)

Indexes• 2 Star – EXPLAIN • Extra • using where • Using index • And Not • Using filesort • Using temporary • Limitation • Using temporary - query contains different GROUP BY and ORDER BY columns

Indexes• 3 Star – EXPLAIN • Type • index • Extra • Using index • Using index for group-by

Regular UsageSELECT , FROM ₸ WHERE = 121; (PRIMARY KEY )

Range Scan SELECT ▲, ■ FROM ₸ WHERE = 121AND BETWEEN 1 AND 100 KEY ( , )

Range ScanSELECT ▲, ■ FROM ₸ WHERE = 121AND IN (1,100,30,7) KEY ( , )

Covering IndexSELECT , FROM ₸ WHERE = 121 KEY ( , , )

Not OptimalSELECT FROM ₸ WHERE = 121AND ♪ IN (1,100,30,7) KEY ( , )

Broken RangeSELECT ▲, ■ FROM ₸ WHERE = 121AND IN (1,100,30,7); KEY ( , )

Sub Queries SELECT ▲, ■ FROM ₸WHERE IN (SELECT ♪ FROM ♠) KEY ( )

Indexes for SortingSELECT ☼, ☺ FROM ₸ WHERE = 121 GROUP BY ORDER BY KEY ( )

Indexes for SortingSELECT ☼, ☺ FROM ₸ WHERE = 121 GROUP BY ORDER BY KEY ( , ) or KEY ( , )

Indexes for Joins SELECT . , . FROMINNER JOIN ON . = . WHERE . = 232 KEY ( ) or KEY( )

WHERE ₸.Ω = 232; INNER JOIN ♫

FULL SCAN ♫ INNER JOIN ₸ FILTER Ω = 232

SELECT ▲, (SELECT .. FROM WHERE ) FROM ₸ WHERE ♪ IN (SELECT ♪ FROM ♠ WHERE..); “I need help optimizing the my.cnf”

Clustered PK and Secondary Indexes KEY ( , , ) PRIMARY KEY ( ) KEY ( , ) (Can be used in GROUP / ORDER BY SELECT variables to make Covering index)http://www.dbasquare.com/2012/05/17/can-mysql-use-primary-key-values-from-a-secondary-index/

Index Merge SELECT ☺ FROM ₸WHERE =1 OR =2; KEY ( ) KEY ( )

Index Merge SELECT☺FROM ₸ WHERE =1UNION (SELECT☺FROM ₸ WHERE =2) KEY ( ) KEY ( )* 5.6

Gathering DataWhat you will need:1. MySQL Slow log: MySQL >= 5.1 or Percona/MariaDB microslow patch2. Set long_query_time = 0 - for 6 to 24 hours for decent sized slow log. (Make sure host has enough space)

Log ProcessingWorst Response Queries1. Echo ‘’ > slow.log2. mysql> set global long_query_time=0; set long_query_time = 0; flush logs;3. Wait X hours and return original value.4. pt-query-digest slow.log > slow.txt Processing should beBulky Queries done on another host--filter ‘($event->{Rows_examined} > 1000)’Write Queries--filter '($event->{Rows_affected} > 0)‘

Log Processing• MySQL 5.6 • Statement Digest • No need for log processing to get Digest.

Worst Response Queries pt-query-digest slow.log > slow.txtRank Response time Calls R/Call Item SELECT dp_node dp_usernode 1 480.9273 16.3% 600 0.8015 dp_buddylist dp_users dp_node_access 2 322.4220 4.3% 129258 0.0025 ADMIN INIT DB 3 314.8719 4.2% 30220 0.0104 UPDATE dp_users 4 287.7109 3.8% 51606 0.0056 SET SELECT dp_node dp_usernode 5 269.3434 3.6% 600 0.4489 dp_buddylist dp_users dp_node_access 6 238.8571 3.1% 2902141 0.0001 SELECT dp_url_alias

mysql tables in use 4, locked 25289 lock struct(s), heap size 620984, 273785 row lock(s), undo log entries363312MySQL thread id 467, OS thread handle 0x7fceab7df700, query id 88914423 SELECT r.exchange_rate INTO destination_exchange_rate FROM exchange_rate AS r WHERE r.currency_id = NAME_CONST('destination_currency_id',6) AND r.date = NAME_CONST('day',_latin1'2012-06-30' COLLATE 'latin1_swedish_ci')Trx read view will not see trx with id >= 10ECC92B0, sees < 10ECC916FTABLE LOCK table `currency` trx id 10ECC90CC lock mode ISRECORD LOCKS space id 0 page no 261 n bits 80 index `fk_currency_status1` oftable `currency` trx id 10ECC90CC lock mode STABLE LOCK table `daily_summary` trx id 10ECC90CC lock mode ISRECORD LOCKS space id 0 page no 34829580 n bits 200 index `PRIMARY` of table`daily_summary` trx id 10ECC90CC lock mode STABLE LOCK table `exchange_rate` trx id 10ECC90CC lock mode ISTOO MANY LOCKS PRINTED FOR THIS TRX: SUPPRESSING FURTHER PRINTS

*** (1) TRANSACTION:TRANSACTION 13DCDF4D9, ACTIVE 0 sec starting index readmysql tables in use 1, locked 1LOCK WAIT 3 lock struct(s), heap size 1248, 2 row lock(s)MySQL thread id 2438176, OS thread handle 0x7f9a37408700, query id118341815748UPDATE sys_doctrine_lock_trackingSET timestamp_obtained = '1341839053' WHERE object_key = '1146' AND user_ident = '158' AND c_type = '137'*** (1) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 0 page no 48627 n bits 280 index `PRIMARY` of table`sys_doctrine_lock_tracking` trx id 13DCDF4D9 lock_mode X locks rec but notgap waitingRecord lock, heap no 207 PHYSICAL RECORD: n_fields 6; compact format; 0: len 8; hex 43616d706169676e; 1: len 4; hex 31313436; asc 1146;; 2: len 6; hex 00013dc12d7a; asc = -z;; …

Bottlenecks• Locking Queries • Try to make them complete as fast as possible • JOINs vs sub query • Function wrapped around index • UDF with SQL inside • Long Transactions • Looping with short queries

Misbehaving Optimizer• Optimizer Hints: • USE INDEX • FORCE INDEX • IGNORE INDEX • STRIGHT_JOIN• Joins • LEFT sometimes faster then INNER* Too many indexes confuse the optimizer

Bottlenecks• Virtualization • Increase (obscenely) innodb_log_file_size • Needs restart + deleting old log files • EXT3• General I/O improvements • innodb-flush-log-at-trx-commit • Sync binlog Group Commit • Xa support

Bottlenecks• General I/O improvements • Percona server • Better flushing to disk • Less mutexes • Upgrade MySQL • Same reasons as above • Innodb-io-capacity http://www.wmarow.com/strcalc/

Database Upgrades• My Secret Sauce for smooth MySQL Migrations 1. Upgrade the dev/staging DBs with desired version 2. Wait 1-2 months till silky-smooth • All features have been tested • All the query issues have been fixed 3. Upgrade servers • Down to Up – Slaves first

Bottlenecks• Mutexes • Query Cache • “Freeing items” in processlist• Network Batch Processes • Skip-name-resolve • net_write_timeout / read_timeout • thread_cache_size Lots of connections

Homework• Haven’t talked fully about: SQL and EXPLAIN• Webinars • Indexes - percona.tv/percona-webinars/tools-and- techniques-for-index-design • Explain - percona.tv/percona-webinars/explain-demystified• Websites • http://www.myxplain.net

Columns Partition Partition Partition PartitionAlgorithm Partition Rows Partition Partition Partition Partition

Select Algorithm Inserting Parallelize Data Unique Key Algorithm Reduce DataForeign Keys Issues Use Cases Primary Key Overhead Partitioning ID Benefits Automatic TimeB-TreeLevels Manual Hash Short Scans Time 80-90% Archive DB Usage Shards

Partitioning Parallelize Data Reduce Data• Use Cases Use Cases • Reducing Data – Only get the partition/table that you need • Parallelizing Data - Get an equal amount of data from each partition in parallel• Benefits Benefits • Shorter table scans B-Tree • Less levels for index scans Levels Short Scans

Issues• Algorithm Select Inserting Algorithm • INSERTs Algorithm • SELECTs Issues• Keys Unique Key • Foreign Key Foreign Keys • Unique Key Primary Key Overhead • Primary Key Overhead • Increases Table size • Can change indexes

Partition Types• Range• List ID• Hash Automatic Time• Key• Columns Hash• Sub-partitioning

Partitioning by Usage vs Partitioning by MaintenanceRank Response time Calls Item SELECT address FROM WHERE 1 480.9273 26.3% 129258 BETWEEN ‘2012-11-10’ and ‘2012-11- 17’ SELECT total FROM WHERE 2 322.4220 14.3% 600 BETWEEN ‘2012-11-01’ and ‘2012-11-30’ UPDATE SET active=1 WHERE = 3 34.8719 4.2% 30220 17635376 SELECT dispatch_time FROM 4 28.7109 3.8% 51606 WHERE = 7387612

CREATE TABLE orders id int unsigned not null auto_increment, `date` date not null, … Can also doPRIMARY KEY (date, id), PRIMARY KEY (id, date)KEY id (id), KEY date (date)..) ENGINE=InnoDB DEFAULT CHARSET=utf8PARTITION BY ( (date))(PARTITION VALUES LESS THAN ( (‘2012-01-01’),PARTITION VALUES LESS THAN ( (‘2012-06-01’)),PARTITION VALUES LESS THAN )

Manual Partitioning• Archive - Main table & Archive Table• Time – Create table per year, per month..• Shards – Table per country * Foreign keys Manual Time Archive Shards

Intensive Table Optimization Once upon a time, I was researchingways to make a database working set fit asmuch as possible to memory…

mysqlperformanceblog.com/2010/04/08/fast-ssd-or-more-memory/

Intensive Table Optimization1. People are usually very liberal with data type sizes2. There were (usually) so many indexes that: 1. They multiplied the table size 2. Were not efficient compares to how the table is used 3. Confused the optimizer3. Discovered partitions were not used or misused 1. Discovered sub partitions 2. Primary Key alignment4. Discovered InnoDB compression

Intensive Table OptimizationThere are tools to help with this, but…

Intensive Table Optimization Data Types Slowest Queries BottleneckUser Statistics Optimizations Tables Optional Table Sizes Query Logs Partitions Indexes Foreign Keys

Gathering DataWhat you will need:1. MySQL Slow log: MySQL >= 5.1 or Percona/MariaDB microslow patch2. Set long_query_time = 0 - for 6 to 24 hours for decent sized slow log. (Make sure host has enough space) Slowest Queries Bottleneck Tables

Gathering DataHelpful (Optional):1. Percona/Mariadb user_statistics patch2. Get list of most read/written tables3. Get list of used and un-used indexes4. List of largest tables Table Sizes User Statistics Bottleneck Tables

Worst Response Queries pt-query-digest slow.log > slow.txtRank Response time Calls Item 1 8589.9513 27.5% 231051 UPDATE dp_users 2 4752.6688 15.2% 257235 SELECT dp_cache_menu 3 1606.4946 5.1% 183542 SELECT community_chats 4 1418.9034 4.5% 259939 SELECT dp_cache 5 564.3305 1.8% 7970165 SELECT dp_url_alias 6 495.0092 1.6% 44940 SELECT dp_event dp_node

Table StatisticsSELECT table_name,FROM information_schema.table_statisticsORDER BY DESC LIMIT 5; table_name rows_read dp_users 2302477894ROWS_CHANGED dp_node 1231318439ROWS_CHANGED_X_INDEXES dp_comments 1071462211 dp_userpoints 1033073070 dp_search_index 260154684

Table Statistics Worst Response Tables --group-by tablesRank Response time Calls Item 1 7975.4487 6.5% 124384 advertisement 2 5554.1435 4.5% 1834 info 3 4915.4816 4.0% 208 placement 4 4902.7644 4.0% 158 advert_summary

Table Sizes Table_Name Rows Data Idx Total_size Idxfractotal_daily_summary 610M 77G 88G 165G 1.15advert_summary 478M 57G 45G 102G 0.78log_messages 92M 47G 10G 57G 0.21SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME) AS TABLE_NAME, CONCAT(ROUND(TABLE_ROWS / 1000000, 2), 'M') ROWS, CONCAT(ROUND(DATA_LENGTH / ( 1024 * 1024 * 1024 ), 2), 'G') DATA, CONCAT(ROUND(INDEX_LENGTH / ( 1024 * 1024 * 1024 ), 2), 'G') IDX, CONCAT(ROUND(( DATA_LENGTH + INDEX_LENGTH ) / ( 1024 * 1024 * 1024 ), 2), 'G') TOTAL_SIZE, ROUND(INDEX_LENGTH / DATA_LENGTH, 2) IDXFRACFROM INFORMATION_SCHEMA.TABLESORDER BY DATA_LENGTH + INDEX_LENGTH DESC http://www.mysqlperformanceblog.comLIMIT 10; /2008/03/17/researching-your-mysql-table-sizes/

Which table needs your attention?Table Size

Intensive Table Optimization• Table Targeting • The most “worthy” table to focus your attention on • Biggest bang for your buck• If you know which table is the most troublesome • Ignore most of the investigations • Apart from slow log • Investigations help understand DB usage

Optimizations Compression Data Types Query Logs Partitions IndexesSub Partitioning Don’t Need Foreign Keys

Intensive Table Optimization• Datatypes • SELECT * FROM table G • Example: Tinyint instead of Bigint: (7 bytes row + 7bytes index) * 350million rows = 4.9Gb • Enum instead of Varchar • Remove NULLs when not needed

Intensive Table Optimization• Compression • Best for tables with a lot of varchar/text • Compress table by x2, x4, x8.. • Need to experiment with innodb_strict = on; • On my tests (5.5) – Very very slow • Alter tables • INSERTS/UPDATES/DELETES Optimizations Compression

1. Slow Log Filtered by Get Data Target Table Query Digest New Results EXPLAIN 3. 2. Index-Usage Make Test Assumptions Assumptions 4. Deploy

Target Table ProcessingFilter Log:pt-query-digest slow.log--filter '$event->{arg} =~ m/dp_users /'--no-report --print >dp_users.logWorst Queries from new log:pt-query-digest dp_users.log --limit 100%>tbl_dp_users.txt

ResponseRank time Calls Item 209.2863 UPDATE dp_users SET access = 133******3 WHERE 1 88850 10.7% = 23****01G 162.2711 3 1309010 SELECT access FROM dp_users WHERE = 21***4G 8.3% 139.9009 SELECT uid, name FROM dp_users WHERE =1 4 197 7.1% ORDER BY DESCG 133.8691 SELECT * FROM dp_users u WHERE = 5 327 6.8% 's******s'G SELECT name, created, picture FROM dp_users WHERE 109.6903 6 29152 picture !='' AND = '1' AND BETWEEN 5.6% '133*****0' AND '133*****60'G 92.9095 SELECT dp_node dp_users using ( ) 7 360642 4.7% dp_node_revisions 74.2426 SELECT * FROM dp_users u WHERE = hoa****rio' 8 106 3.8% AND = '3837********5f9b' AND = 1G

Partitioning by UsageRank Response time Calls Item SELECT address FROM orders WHERE date 1 480.9273 26.3% 129258 BETWEEN ‘2012-11-10’ and ‘2012-11-17’ SELECT total FROM orders WHERE date 2 322.4220 14.3% 600 BETWEEN ‘2012-11-01’ and ‘2012-11-30’ UPDATE order SET active=1 WHERE id = 3 34.8719 4.2% 30220 17635376 SELECT dispatch_time FROM order WHERE 4 28.7109 3.8% 51606 id = 7387612

Testing AssumptionsSELECT uid, name FROM WHERE = 1 ORDER BY DESCG1.30secsSELECT uid, name FROMWHERE = 1 ORDER BYDESCG0.56secs Query Digest EXPLAIN

Test Environment• Hardware environment similar to live• Data size similar to live environment: • Replicating slave • Cannot change datatypes on MIXED/ROW replication • Create table2 and run queries against it • Xtrabackup – full replica New Results • Script with Mysqldump + WHERE • mysqldump --databases main --tables table1 table2 –where “date > now() – interval 30 day” > dump.sql • Mysqldump –all-database –ignore-table main.table1 main.table2 >> dump.sql

Final Tweaking(Remember the table log file – dp_users.log ?)pt-index-usage• pt-index-usage dp_users.log --host 127.0.0.1 --tables dp_users >idx_dp_users.txt• Go over recommendations Index-Usage Test Assumptions

Deploy Strategies1. Rolling Servers2. pt-online-schema-change Deploy3. Two-part move a. Create new table – table2 b. Insert table rows that will not change – INSERT INTO table2 SELECT * FROM table1 WHERE date <= curdate() – interval 30 day; c. Short downtime d. Rename table1 to table3; rename table2 to table1; e. INSERT IGNORE INTO table1 SELECT * FROM table3 WHERE date >= curdate() – interval 30 day;4. Alter table – long downtime (pre 5.6, maybe)

Slow Log Filtered byGet Data Target Table Query Digest EXPLAIN New Results Index-Usage Test Make Assumptions Assumptions Deploy

Streaming Reverse File Proxy VolatileBrowser Shield Cache Query Cache Page Cache Denormalize 2nd Level Summary Column Tables Cache 3rd Level Subtotal Attributes Data Warehouse Conditional

Read Cache• Outside the database • Page Cache • Query Cache• Inside the database • Column Cache • Summary Table* Complexity

Page Cache• Browser Cache • Etag, Expires, Last-modified• Reverse Proxy • Squid, Varnish, Nginx, Apache, Proprietary.• File/Full page cache • mod_file_cache, Zend_Cache_Backend • W3 Total Cache, sfSuperCache* Stale

memegenerator.net/instance/23247230

Query Cache• Volatile • Memcached, Redis, Hibernate Cache, Arrays.. • On-Request, Time-to-Live, Stale and Cache Stampede• Streaming • Interval / Async, Stale, Common Queries• Shield – Mongo Shield • Script/Tool Replication, Dependency • Aggregation • Complexity / Layers

Mongo Shield 147cmimg.photobucket.com/albums/v158/keris_hanuman/Afbeelding1455.jpg

Sticky StickySessions SessionsMemcached Memcached Cart MySQL

Memcached Sessions MySQL Cart

Manipulating Time* Error Handling

Column Cache• Denormalize • Additional Column(s) to prevent JOINs • Maintenance, Space on disk • Example: CustomerID, OrderID, OrderItemID• Sub Total • Prevent additional slow GROUP BY queries • Maintenance, Generation, Space on disk • Example: totalPurchases, moneyOwed* Space vs Speed

Column Cache• Conditional • Store conditional (True/False) logic • Prevents recalculating result – another query • Can prevent rewriting code • Example: isDone, hasReview, aboveAvg• Attributes • ENUM datatype • SET datatype - ARRAY of options • Prevents JOINs • May save space

Summary TablesAn additional table which consistsof an aggregation of another table orseveral JOIN’d tables. Summary Tables

SELECT ...FROM main_table t1 INNER JOIN table2 t2 on t1.orderid = t2.id INNER JOIN table3 t3 on t1.customerid = t3.id INNER JOIN table4 t4 on t1.addressid = t4.id INNER JOIN table5 t5 on t2.supplierid = t5.id INNER JOIN table6 t6 on t2.warehouse = t6.id INNER JOIN table2 t7 on t6.addressid = t7.id INNER JOIN table8 t8 on t1.productid = t8.id INNER JOIN table9 t9 on t1.buyerid = t9.id INNER JOIN table10 t10 on t1.officeid = t10.idWHEREt1.date between '2012-11-01' and '2012-11-30'GROUP BY t1.date

Summary TablesProcessed 1.2million rowsReturned 30 rowsTime 17.52 minutes

Summary TablesCREATE TABLE summary_table(primary key (date,addressid,productid)) asSELECT ...FROM main_table t1INNER JOIN table2 t2 on t1.orderid = t2.idINNER JOIN table3 t3 on t1.customerid =t3.idINNER JOIN table4 t4 on t1.addressid = t4.idINNER JOIN table6 t6 on t2.warehouse = t6.idGROUP BY t1.date, t1.addressid, t1.productid

Summary TablesSELECT ...FROM summary_table t1INNER JOIN table5 t5 on t2.supplierid = t5.idINNER JOIN table2 t7 on t6.addressid = t7.idINNER JOIN table8 t8 on t1.productid = t8.idINNER JOIN table9 t9 on t1.buyerid = t9.idINNER JOIN table10 t10 on t1.officeid = t10.idWHEREt1.date between '2012-11-01' and '2012-11-30‘GROUP BY t1.date

Summary TablesProcessed 35000 rowsReturned 30 rowsTime 0.75 seconds

Summary Tables as anAnalytics Sub-System

Database Design Operational System Analytic System Comparison Execution of a business Measurement of aPurpose process business process Insert, Update, Query,Primary Interaction Query DeleteDesign Optimization Update concurrency High-performance query Entity-relationship (ER) Dimensional design (StarDesign Principle 3rd Normal form (3NF) schema or cube) amazon.co.uk/Schema-Complete-Reference-Christopher-Adamson/

Data Warehouses Fact Tables Dimension Tables• Measurement • Context• Narrow • Wide• Long • Short• Most of the • Filters and descriptive data data

Operational DesignCustomers Orders Order Items Addresses Products

Star SchemaDate dim Address dim OrderItems Fact Products dim Customers dim

Maintenance• Hourly/Daily/Weekly/Monthly Aggregations• Intervals• Off Peak• On-Insert

A lot more InnoDB settings Read Cache Buffer Pool Indexes Sharding Intensive Scaling Reads Table OptimizationGalera Read Partitioning Slaves Better Hardware Sub Partitioning Another Read/Write Master Splitting IO Memory

Scaling Reads• InnoDB Buffer Pool • Cache Warming• Read buffer A lot more settings InnoDB• Sort Buffer Buffer Pool• Join Buffer• Temp Table size / on disk

Scaling Reads • Better Hardware • Disk I/O • Memory BetterHardware IO Memory

Scaling Reads• Read Slaves • Read/Write Splitting • Master/Master • Galera

Server Architecture MySQL

Columnar £££ A lot more Store settings InnoDB Buffer Pool SummaryHadoop Tables Indexes Intensive OLAP Reporting Table Cubes Optimization Sharding Partitioning ReportingCross Shard Better Slaves Joins Hardware Sub Partitioning Different Indexes IO Memory

Reporting• Reporting Slaves • Different Indexes • No Foreign Keys • Partitioning • If ROW-replication: • Must have same data types Reporting Slaves Different Indexes

Reporting• Sharding • Cross-shard JOINs • Go Fish • Aggregations Hadoop

Reporting • Summary tables - aggregations • Scripts • Hadoop • Ready –Made reports • OLAP Cubes • Columnar Store • £££* RDBMS very fast at GROUP BY

Battery A lot more settings RAID Innodb Log Write-back File Size Local Server Write Buffers Storage MySQL ETL settings QueueHadoop Summarized CRUSH HandCode Writes

Write Buffers• RAID card • BBU • Write-back vs write-through • Battery Learning/Drain

Write Buffers• Innodb log file size • Buffer pool * dirty read (%) * io capacity A lot more settings Innodb Log* Virtual Environment File Size

Write Buffers• innodb-flush-log-at-trx-commit• Sync binlog • Support xa MySQL settings

Write Buffers• ActiveMQ, RabbitMQ, ZeroMQ, Gearman • Not ACID, may need redundancy• Summarized Writes • Memcached counters + interval writes

Write Buffers• Local Storage (web/app servers) • Memcached • SQLite (disk/in-memory) • Log file Local Server <- Most popular Storage • MySQL• Independent, isolated ETL• Need to fetch data • Prevent missed data and duplicates

Write Buffers• Fetching Data • Hand code • ETL tool – Pentaho/Talend • Flume• Aggregation/Processing • Hadoop <- Very Popular • Google CRUSH

Innodb Log File Size Write Buffers Sharding Indexes Intensive Scaling Writes Table Hardware Optimization Remove Partitioning BottlenecksOS Settings Sub Partitioning MySQL Bypass SQL Settings Layer

Scaling Writes• Less Indexes • Less writes• Partitioning • Less table maintenance • Less B-tree levels • Less need to organize blocks • Algorithm overhead • Mutex/Locks

Scaling Writes• Bypass SQL Layer SQL Parser • Innodb/Memcached • HandlerSocket Optimizer Storage Engine * Raik

Scaling Writes• IO Scheduler• File System (+ nobarrier, noatime, nodiratime) • EXT 3 • EXT 4 • XFS • ZFS• Block Sizes

Scaling Writes• Faster I/O • Faster Disks • SAS • SSD • RAID + cache Hardware • PCIe SSD • FusionIO • Virident

Scaling Writes• Master/Master • Does not scale writes • Writes still need to replicate• Sharding • Does scale writes

App Global Data Child Data Function DB By SchemaProxy Partitioned Lookup Data Shared Nothing Functional Sharding Partitioning IDGo Fish Reporting Key Area HashCross Shard Hadoop

Sharding• Partitioned Data• Splitting the Data • Vertically • Main Tables to partition • Child Tables • Global Tables• By Schema• Shared Nothing

Sharding• Partitioning by Key • ID – CustomerID, ProductID, App • Area – Country, City, Continent • Hash – Random for equal spread

Sharding• Which shard has the data? • Store it in a DB • Flexible but slower • Some Function in your App • Faster, less flexible • Proxy config file • Faster, less flexible • Needs some app coding

Sharding• Maintenance • Backups • Slaves• Uptime • Loosely coupled system

Sharding• Functional Partitioning • Different Apps • Share some tables Functional Partitioning

Sharding• Reporting • Go Fish – One server / Shared nothing • Cross Shard – Many servers • Hadoop – Aggregate to one reporting server

The End• Questions & Answers• Email: contact@jonathanlevin.co.uk• Don’t forget to rate this tutorial

If we have time• MySQL 5.6• NoSQL• ORM• Beyond Hadoop• Bring-Your-Own-Problems

Movatterモバイル変換

Change Language

Scaling MySQL Strategies for Developers

Embed presentation

Recommended

More Related Content

What's hot

Viewers also liked

Similar to Scaling MySQL Strategies for Developers

Recently uploaded

Scaling MySQL Strategies for Developers