<?xml version="1.0" encoding="utf-8"?>
<launchpad-bug id="714419">
  <date_last_updated>2011-02-07 12:03:11.270404+00:00</date_last_updated>
  <api_links>
    <bug_api_link>https://api.launchpad.net/1.0/bugs/714419</bug_api_link>
    <bug_owner_link>https://api.launchpad.net/1.0/~philip-stoev</bug_owner_link>
    <milestone_link></milestone_link>
    <linked_branches_collection_link>https://api.launchpad.net/1.0/bugs/714419/linked_branches</linked_branches_collection_link>
    <activity_link>https://api.launchpad.net/1.0/bugs/714419/activity</activity_link>
  </api_links>
  <bug_web_link>https://bugs.launchpad.net/bugs/714419</bug_web_link>
  <owner>Philip Stoev</owner>
  <assignee>Kristian Nielsen</assignee>
  <milestone_title></milestone_title>
  <duplicate_link></duplicate_link>
  <duplicate_bug_id></duplicate_bug_id>
  <title>Replication failures with slave provisioning via mysqldump and tables with no PK</title>
  <status>Won't Fix</status>
  <importance>Undecided</importance>
  <created>2011-02-07 07:38:02.003429+00:00</created>
  <description>
<![CDATA[The following test fails in MySQL 5.1, maria-5.2 and maria-5.2-rpl, but passes under MySQL 5.5. The test operates as follows:

1. A transactional concurrent UPDATE workload is maintained on the master throughout the test;

2. Around the middle of the test, a new slave is provisioned via mysqldump (port 19304, datadir and logs in mysql-test/var/master-data_clonedslave) and then started;

3. At the end of the test, the new slave is checked via SHOW SLAVE STATUS, then dumped and compared to the master.

(There is a separate slave, on port 19304, that is not provisioned via mysqldump. It replicates properly and is not relevant to this particular bug.)

The slave reports either:

110207  9:04:37 [ERROR] Slave SQL: Error 'Table 'test.table1_innodb_int' doesn't exist' on opening tables, Error_code: 1146
110207  9:04:37 [Warning] Slave: Table 'test.table1_innodb_int' doesn't exist Error_code: 1146

even though the table does exist on the slave and the workload does not contain any DDL statements

or

110207  9:37:17 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table100_innodb; Can't find record in 'table100_innodb', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 7833035, Error_code: 1032

This issue seems to be related to the more general problem of replicating tables with no PK. In this particular test, straightforward replication does work; what does not work is provisioning a new slave via mysqldump. Extensive fixes for PK-less replication were done for 5.5.]]>  </description>
  <activities>
    <activity datechanged="2011-02-07T07:38:02.003429+00:00">
      <oldvalue>
<![CDATA[]]>      </oldvalue>
      <newvalue>
<![CDATA[]]>      </newvalue>
      <whatchanged>bug</whatchanged>
      <person>Philip Stoev</person>
      <message>added bug</message>
    </activity>
    <activity datechanged="2011-02-07T08:01:14.575451+00:00">
      <oldvalue>
<![CDATA[]]>      </oldvalue>
      <newvalue>
<![CDATA[ZZ file https://bugs.launchpad.net/maria/+bug/714419/+attachment/1835409/+files/bug714419.zz]]>      </newvalue>
      <whatchanged>attachment added</whatchanged>
      <person>Philip Stoev</person>
      <message></message>
    </activity>
    <activity datechanged="2011-02-07T08:01:33.507207+00:00">
      <oldvalue>
<![CDATA[]]>      </oldvalue>
      <newvalue>
<![CDATA[YY file https://bugs.launchpad.net/maria/+bug/714419/+attachment/1835410/+files/bug714419.yy]]>      </newvalue>
      <whatchanged>attachment added</whatchanged>
      <person>Philip Stoev</person>
      <message></message>
    </activity>
    <activity datechanged="2011-02-07T08:02:45.726388+00:00">
      <oldvalue>
<![CDATA[]]>      </oldvalue>
      <newvalue>
<![CDATA[Kristian Nielsen (knielsen)]]>      </newvalue>
      <whatchanged>maria: assignee</whatchanged>
      <person>Philip Stoev</person>
      <message></message>
    </activity>
    <activity datechanged="2011-02-07T08:03:17.534911+00:00">
      <oldvalue>
<![CDATA[Replication failures with slave provisioning via mysqldump under UPDATE workload]]>      </oldvalue>
      <newvalue>
<![CDATA[Replication failures with slave provisioning via mysqldump and tables with no PK]]>      </newvalue>
      <whatchanged>summary</whatchanged>
      <person>Philip Stoev</person>
      <message></message>
    </activity>
    <activity datechanged="2011-02-07T11:50:27.748311+00:00">
      <oldvalue>
<![CDATA[New]]>      </oldvalue>
      <newvalue>
<![CDATA[Won't Fix]]>      </newvalue>
      <whatchanged>maria: status</whatchanged>
      <person>Kristian Nielsen</person>
      <message></message>
    </activity>
  </activities>
  <comments>
    <comment commentlink="https://api.launchpad.net/1.0/maria/+bug/714419/comments/1" datecreated="2011-02-07T08:01:14.575451+00:00">
      <person>Philip Stoev</person>
      <subject>
<![CDATA[Re: Replication failures with slave provisioning via mysqldump under UPDATE workload]]>      </subject>
      <content>
<![CDATA[]]>      </content>
    </comment>
    <comment commentlink="https://api.launchpad.net/1.0/maria/+bug/714419/comments/2" datecreated="2011-02-07T08:01:33.507207+00:00">
      <person>Philip Stoev</person>
      <subject>
<![CDATA[Re: Replication failures with slave provisioning via mysqldump under UPDATE workload]]>      </subject>
      <content>
<![CDATA[]]>      </content>
    </comment>
    <comment commentlink="https://api.launchpad.net/1.0/maria/+bug/714419/comments/3" datecreated="2011-02-07T08:02:33.900432+00:00">
      <person>Philip Stoev</person>
      <subject>
<![CDATA[Re: Replication failures with slave provisioning via mysqldump under UPDATE workload]]>      </subject>
      <content>
<![CDATA[To reproduce, branch a fresh version of RQG using "bzr branch lp:randgen" and then run:

 perl runall.pl \
--gendata=bug714419.zz \
--rpl_mode=row \
--duration=60 \
--queries=1000000000 \
--threads=10 \
--validator=None \
--basedir=/home/philips/bzr/maria-5.2-rpl \
--engine=InnoDB \
--mysqld=--default-storage-engine=Innodb \
--grammar=bug714419.yy \
--reporter=CloneSlave


]]>      </content>
    </comment>
    <comment commentlink="https://api.launchpad.net/1.0/maria/+bug/714419/comments/4" datecreated="2011-02-07T12:03:09.307680+00:00">
      <person>Kristian Nielsen</person>
      <subject>
<![CDATA[Re: Replication failures with slave provisioning via mysqldump and tables with no PK]]>      </subject>
      <content>
<![CDATA[I managed to reproduce this locally and investigated the issue.
After the failure, I manually started the cloned slave in the
master-data_clonedslave directory like this:

    cd mysql-test/var/master-data_clonedslave
    /home/knielsen/my/5.2/work-5.2-rpl/sql/mysqld --no-defaults --datadir=$(pwd) --character-sets-dir=/home/knielsen/my/5.2/work-5.2-rpl/sql/share/charsets --language=/home/knielsen/my/5.2/work-5.2-rpl/sql/share/english/ --skip-slave-start --relay-log=clonedslave-relay

It fails on this query (in my particular case):

    /* QUERY_IS_REPLICATION_SAFE */ UPDATE `table1_innodb` SET `col_varchar_255_utf8` = REPEAT(LPAD(CONNECTION_ID(), 2, ' '), 2 ) , `col_bigint` = REPEAT(LPAD(CONNECTION_ID(), 2, ' '), 3 )

What I see is that the row-based update event corresponding to this has the
value 1212121212121212 for column `col_double_key`, while the table on the
cloned slave has the value 1212121212121210. This causes the event to fail to
apply.

I think this is a fundamental problem with using mysqldump to save and restore
floating-point numbers. The problem is that the textual decimal representation
used by mysqldump is not precise:

    > select col_double_key from table1_innodb;
 1.21212121212121e+15

This output is not precise enough to distinguish the two different values on
the master and cloned slave.

In MySQL 5.5 they use a better library for printing floating-point
numbers. Maybe that provides a textual decimal representation for mysqldump
with sufficient precision, which could be the reason the bug is not
reproducible there.

I do not think we should fix this in 5.1/5.2.

A work-around to get the RQG test to not fail could be to only use values for
double columns which are integers with at most 14 digits, as I believe the
textual decimal representation is precise for such numbers.
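
The rounding effect described above can be illustrated with a minimal Python
sketch. It assumes the dump prints doubles with 15 significant digits; this
is inferred from the output shown above, not confirmed from the mysqldump
source. The helper names dump_as_text and round_trip are hypothetical and
exist only for this illustration.

```python
def dump_as_text(value: float, digits: int = 15) -> str:
    """Format a double with a fixed number of significant digits.

    Assumption: this mimics a textual dump that keeps 15 significant digits.
    """
    return format(value, f".{digits}g")


def round_trip(value: float, digits: int = 15) -> float:
    """Simulate dumping a double as text and loading it back."""
    return float(dump_as_text(value, digits))


master_value = 1212121212121212.0  # value carried by the row event

# 16 significant digits are needed here, so 15 digits lose the last one.
print(dump_as_text(master_value))                 # 1.21212121212121e+15
print(round_trip(master_value) == master_value)   # False

# 17 significant digits are enough to round-trip any IEEE 754 double.
print(round_trip(master_value, digits=17) == master_value)  # True

# An integer with at most 14 digits survives the 15-digit format.
print(round_trip(99999999999999.0) == 99999999999999.0)     # True
```

Under that assumption, the value restored on the cloned slave is
1212121212121210, which matches the mismatch seen when applying the row
event, and the 14-digit case agrees with the proposed work-around.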
]]>      </content>
    </comment>
  </comments>
  <messages>
    <message created="2011-02-07 08:01:14.575451+00:00" owner="Philip Stoev">
<![CDATA[]]>      <attachment link="https://bugs.launchpad.net/bugs/714419/+attachment/1835409" type="Unspecified">
        <title>ZZ file</title>
        <file>LPexportBug714419_bug714419.zz</file>
      </attachment>
    </message>
    <message created="2011-02-07 08:01:33.507207+00:00" owner="Philip Stoev">
<![CDATA[]]>      <attachment link="https://bugs.launchpad.net/bugs/714419/+attachment/1835410" type="Unspecified">
        <title>YY file</title>
        <file>LPexportBug714419_bug714419.yy</file>
      </attachment>
    </message>
  </messages>
</launchpad-bug>
