- While I’ve read many related articles before, true understanding comes from practice.
- Recently, the data team restructured logging formats. I designed a table schema and asked them to write cleaned client logs into it.
- As this is a video playback statistics table, we implemented date-based partitioning.
- With less than 100k videos in the library, we opted for monthly partitioning instead of daily.
For auto-increment primary key id
, you must create composite primary key with partitioning key
- In our actual implementation, we simply omitted the primary key
`id` BIGINT UNSIGNED NOT NULL auto_increment,
PRIMARY KEY(`id`, `date_key`)
Partition by month
- Other playback statistics fields are omitted
- Records before 2021-09-01 will be stored in p_202108 partition
- Records from 2021-09-01 to 2021-09-30 will go to p_202109 partition
CREATE TABLE `video_play_logs` (
`video_id` INT UNSIGNED NOT NULL,
`date_key` date NOT NULL,
index `video_play_logs_video_id_index` ( `video_id` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
PARTITION BY RANGE(to_days(`date_key`)) (
PARTITION p_202108 VALUES LESS THAN (to_days('20210901')),
PARTITION p_202109 VALUES LESS THAN (to_days('20211001')),
PARTITION p_202110 VALUES LESS THAN (to_days('20211101')),
PARTITION p_202111 VALUES LESS THAN (to_days('20211201')),
PARTITION p_202112 VALUES LESS THAN (to_days('20220101'))
)
Verify partition configuration
- Insert test data and check TABLE_ROWS for record counts in each partition
SELECT
TABLE_NAME,
PARTITION_NAME,
TABLE_ROWS
FROM
information_schema.`PARTITIONS`
WHERE
table_name='video_play_logs'
ORDER BY TABLE_ROWS desc
+-----------------+----------------+------------+
| TABLE_NAME | PARTITION_NAME | TABLE_ROWS |
+-----------------+----------------+------------+
| video_play_logs | p_202108 | 0 |
| video_play_logs | p_202109 | 0 |
| video_play_logs | p_202110 | 0 |
| video_play_logs | p_202111 | 0 |
| video_play_logs | p_202112 | 0 |
| video_play_logs | p_202201 | 0 |
| video_play_logs | p_202202 | 0 |
| video_play_logs | p_202203 | 0 |
| video_play_logs | p_202204 | 0 |
......
Check partition hit status
- When
partitions
column shows data, it indicates successful partition pruning
EXPLAIN PARTITIONS SELECT * FROM video_play_logs where date_key='2021-08-27';
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | video_play_logs | p_202109 | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.16 sec)