Not PHP-specific (as others have mentioned, "advanced" MySQL knowledge should be language-independent), but here you go (from this question and this question):
Understanding MySQL Internals
High Performance MySQL:
Learn how to design schemas, indexes, queries and advanced MySQL features for maximum performance, and get detailed guidance for tuning your MySQL server, operating system, and hardware to their fullest potential. You'll also learn practical, safe, high-performance ways to scale your applications with replication, load balancing, high availability, and failover.
Pro MySQL:
Topics include transaction processing and indexing theory, benchmarking and profiling, and advanced coverage of storage engines, data types, subqueries, derived tables, and joins. Also covers MySQL 5's new enterprise features like stored procedures, triggers, and views.
(Partial descriptions from Amazon included; see the respective product pages for more detailed info.)
[root@*****]# perror 28
OS error code 28: No space left on device
[root@*****]# perror 32
OS error code 32: Broken pipe
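Seems strange. Since the mysqldump keeps breaking at random places, the errors are space-related, and there is no disk-full condition, I would suspect the problem lies at a deeper layer: the MySQL Packet. What is a MySQL Packet? According to page 99 of the book, here are paragraphs 1-3 explaining it: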
MySQL network communication code was written under the assumption that queries are always reasonably short, and therefore can be sent to and processed by the server in one chunk, which is called a packet in MySQL terminology. The server allocates the memory for a temporary buffer to store the packet, and it requests enough to fit it entirely. This architecture requires a precaution to avoid having the server run out of memory: a cap on the size of the packet, which this option accomplishes.
The code of interest in relation to this option is found in sql/net_serv.cc. Take a look at my_net_read(), then follow the call to my_real_read() and pay particular attention to net_realloc().
This variable also limits the length of a result of many string functions. See sql/field.cc and sql/item_strfunc.cc for details.
Given this explanation, making bulk INSERTs will load/unload a MySQL Packet rather quickly. This is especially true when max_allowed_packet is too small for the given load of data coming at it.
I wrote about this before: MySQL server has gone away obstructing import of large dumps
Try raising max_allowed_packet for the mysqldump to 1G as follows:
mysqldump --max-allowed-packet=1073741824 ...
and try the mysqldump.
If this does not do it, then do this:
Add this to my.cnf
[mysqld]
max_allowed_packet = 1G
Then, log in to MySQL as root@localhost and run this
mysql> SET GLOBAL max_allowed_packet = 1024 * 1024 * 1024;
and try the mysqldump again. Give it a try!
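The three spellings of the limit used above (1073741824 on the mysqldump command line, 1G in my.cnf, and 1024 * 1024 * 1024 in SET GLOBAL) are the same number of bytes. Here is a small Python sketch, not part of the original answer, that checks this and shows why a large bulk INSERT trips a small limit. The helper names parse_size and fits_in_packet and the table name t are made up for illustration, and the size check is a simplification that ignores protocol overhead:

```python
# Illustrative helpers only; these names are hypothetical, not a MySQL API.
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a my.cnf-style size such as '16M' or '1G' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)

def fits_in_packet(statement, max_allowed_packet):
    """Rough check: does the encoded statement fit under the limit?

    Simplification: a real packet also carries a few bytes of protocol
    overhead, which this sketch ignores.
    """
    return len(statement.encode("utf-8")) <= max_allowed_packet

one_gig = parse_size("1G")
print(one_gig == 1073741824 == 1024 * 1024 * 1024)

# A bulk INSERT of roughly 2 MB, similar in spirit to the extended
# INSERTs found in a dump file.
values = ",".join(["('" + "x" * 100 + "')"] * 20000)
bulk_insert = "INSERT INTO t VALUES " + values

print(fits_in_packet(bulk_insert, parse_size("1M")))  # too big for 1M
print(fits_in_packet(bulk_insert, one_gig))           # fits under 1G
```

The point is only that one oversized statement is enough to break the transfer, no matter how much free disk space there is.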
The information you have in the question concerning MyISAM is right on target. However, I would like to address your two additional questions:
LATEST QUESTION
What if users update existing data with longer data? Will MyISAM mark the record as deleted and find a place that fits the new data, or simply use an overflow pointer to point to the data that does not fit?
According to the book, Chapter 10: "Storage Engines", page 196, paragraph 7 says:
For records with variable length, the format is more complicated. The first byte contains a special code describing the subtype of the record. The meaning of the subsequent bytes varies with each subtype, but the common theme is that there is a sequence of bytes that contains the length of the record, the number of unused bytes in the block, NULL value indicator flags, and possibly a pointer to the continuation of the record if the record did not fit into the previously created space and had to be split up. This can happen when one record gets deleted, and a new one to be inserted into its place exceeds the original one in size. You can get the details of the meanings of different codes by studying the switch statement in _mi_get_block_info() in storage/myisam/mi_dynrec.c.
Based on that paragraph, the old record gets overwritten with linkage data only if the new data to insert cannot fit in the previously allocated block. This can result in many bloated rows.
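To make the "continuation pointer" idea concrete, here is a toy Python model of a dynamic-row data file. It is only a sketch of the behavior described in the paragraph above, not MyISAM's actual on-disk layout: the Block and DynamicFile classes and their methods are invented for illustration, and block headers, NULL flags and free-space reuse are left out.

```python
# Toy model of a dynamic-row data file; not MyISAM's real on-disk format.
class Block:
    def __init__(self, capacity):
        self.capacity = capacity   # bytes reserved when the block was created
        self.data = b""            # bytes currently stored in the block
        self.next = None           # index of the continuation block, if any

class DynamicFile:
    def __init__(self):
        self.blocks = []

    def insert(self, payload):
        """Append a block sized exactly for the payload; return its index."""
        block = Block(len(payload))
        block.data = payload
        self.blocks.append(block)
        return len(self.blocks) - 1

    def update(self, index, payload):
        """Overwrite in place; spill what does not fit into a linked block."""
        head = self.blocks[index]
        head.data = payload[:head.capacity]
        rest = payload[head.capacity:]
        if not rest:
            head.next = None       # an old continuation becomes dead space
        elif head.next is not None:
            self.update(head.next, rest)
        else:
            head.next = self.insert(rest)

    def read(self, index):
        """Reassemble a record by following its continuation pointers."""
        parts = []
        while index is not None:
            block = self.blocks[index]
            parts.append(block.data)
            index = block.next
        return b"".join(parts)

    def fragments(self, index):
        """Count how many blocks a single record is spread across."""
        count = 0
        while index is not None:
            count += 1
            index = self.blocks[index].next
        return count

f = DynamicFile()
a = f.insert(b"short")
b = f.insert(b"neighbour")
f.update(a, b"a much longer value")   # no longer fits in the 5-byte block
print(f.read(a), f.fragments(a))
```

In this model the updated record ends up split across two blocks, and every read of it has to follow the link.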
ADDITIONAL QUESTION
Would it be very inefficient if the table has had rows deleted and inserted many times, since the record structure could potentially be full of overflow pointers and unused space?
From my previous answer, there would be lots of blocks that have
block of space
the length of the record
the number of unused bytes in the block
NULL value indicator flags
possibly a pointer to the continuation of the record if the record did not fit into the previously created space and had to be split up
Such record links would start at the front of every row that has had oversized data inserted. This can bloat a MyISAM table's .MYD file very quickly.
SUGGESTIONS
The default row format of a MyISAM table is Dynamic when it has variable-length columns (VARCHAR, TEXT, BLOB). When a table is Dynamic and experiences lots of INSERTs, UPDATEs, and DELETEs, it needs to be optimized with
OPTIMIZE TABLE mytable;
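As a rough picture of what that defragmentation buys you, here is a toy Python sketch. It assumes a very simplified data file made of (record_id, length) chunks, where a record_id of None stands for space left behind by deletes or shrunken updates. The functions wasted_bytes and optimize are hypothetical names, and this is not how the server implements OPTIMIZE TABLE internally:

```python
# Toy model: a data file as a list of (record_id, length) chunks.
# record_id None marks dead space left by a DELETE or a shrunken UPDATE.
def wasted_bytes(chunks):
    """Total bytes occupied by dead space."""
    return sum(length for record_id, length in chunks if record_id is None)

def optimize(chunks):
    """Rewrite the file keeping only live data, one chunk per record."""
    sizes = {}
    for record_id, length in chunks:
        if record_id is not None:
            sizes[record_id] = sizes.get(record_id, 0) + length
    return list(sizes.items())

# Record 2 is split across two chunks; two holes sit between live rows.
data_file = [(1, 40), (None, 25), (2, 10), (3, 30), (None, 60), (2, 35)]
compacted = optimize(data_file)

print(wasted_bytes(data_file), wasted_bytes(compacted))
print(compacted)
```

After the rewrite the holes are gone and the split record is back in one piece, which is why the .MYD file shrinks and reads get cheaper.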
There is an alternative: switch the table's row format to Fixed. That way, all rows are the same size. This is how you make the row format Fixed:
ALTER TABLE mytable ROW_FORMAT=Fixed;
Even with a Fixed row format, time must be taken to locate an available record, but that search is O(1): in layman's terms, it takes the same amount of time to locate an available record no matter how many rows the table has or how many deleted rows there are. You could bypass that step by enabling concurrent_insert as follows:
Add this to my.cnf
[mysqld]
concurrent_insert = 2
MySQL restart not required. Just run
mysql> SET GLOBAL concurrent_insert = 2;
This would cause all INSERTs to go to the back of the table without looking for free space.
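Here is a toy Python sketch of the two behaviors just described. It assumes a simplified fixed-row file in which deleted slots are remembered on a stack. The FixedFile class and the append_only flag are invented for illustration, and append_only merely mimics the "always insert at the back" effect this answer attributes to concurrent_insert = 2:

```python
# Toy model of a fixed-row file; class and flag names are hypothetical.
class FixedFile:
    def __init__(self, row_length, append_only=False):
        self.row_length = row_length
        self.rows = []             # slot n lives at byte offset n * row_length
        self.free = []             # stack of deleted slot numbers
        self.append_only = append_only

    def offset(self, slot):
        """Byte offset of a slot: plain arithmetic, no searching."""
        return slot * self.row_length

    def delete(self, slot):
        self.rows[slot] = None
        self.free.append(slot)

    def insert(self, row):
        if self.free and not self.append_only:
            slot = self.free.pop()     # O(1): reuse the last freed slot
            self.rows[slot] = row
        else:
            slot = len(self.rows)      # go straight to the end of the file
            self.rows.append(row)
        return slot

def demo(append_only):
    table = FixedFile(row_length=100, append_only=append_only)
    for i in range(5):
        table.insert(f"row{i}")
    table.delete(2)
    slot = table.insert("new row")
    return slot, table.offset(slot)

print(demo(append_only=False))   # reuses the hole left by the delete
print(demo(append_only=True))    # ignores the hole and appends
```

Either way the cost of finding the slot does not depend on the table size. The append-only variant just trades reuse of holes for skipping the lookup.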
Advantage of Fixed Row tables
INSERTs, UPDATEs, and DELETEs would be somewhat faster
SELECTs are 20-25% faster
Here are some of my posts on SELECTs being faster when the row format is Fixed
May 03, 2012 : Which is faster, InnoDB or MyISAM?
Sep 20, 2011 : Best of MyISAM and InnoDB
May 10, 2011 : What is the performance impact of using CHAR vs VARCHAR on a fixed-size field?
Disadvantage of Fixed Row tables
In most cases, when you run ALTER TABLE mytable ROW_FORMAT=Fixed;, the table may grow 80-100%. The .MYI file (index pages for the MyISAM table) would also grow at the same rate.
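A back-of-the-envelope way to see where that growth comes from: with Fixed rows every column takes its full declared width, while Dynamic rows store only what is actually there. The Python sketch below uses made-up column widths and average lengths, assumes a single-byte character set, and ignores per-row header bytes, so treat the numbers as an illustration only and measure your own tables:

```python
# Hypothetical columns: (declared width, average stored length) in bytes.
# Assumes a single-byte character set; per-row headers are ignored.
columns = {"name": (40, 22), "city": (30, 14), "note": (60, 30)}

# Fixed format: every row reserves the full declared width of each column.
fixed_row = sum(width for width, _ in columns.values())

# Dynamic format (rough estimate): actual bytes plus a 1-byte length each.
dynamic_row = sum(avg + 1 for _, avg in columns.values())

growth = (fixed_row - dynamic_row) / dynamic_row * 100
print(fixed_row, dynamic_row, round(growth))
```

With these invented numbers the row grows from 69 to 130 bytes, a bit under 90%. Columns that are usually nearly full would grow far less, and mostly empty ones far more.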
EPILOGUE
If you want speed for MyISAM tables and can live with bigger tables, my alternate suggestions (Fixed row format and concurrent_insert) are the way to go. If you want to conserve space for each MyISAM table, leave the row format as is (Dynamic). With Dynamic tables, you will have to compact the table with OPTIMIZE TABLE mytable; more frequently.