有两张表:data_log_1、data_log_2,数据插入时间不是同步的,通过字段user、phone_number做关联,然后将data_log_1中v_id 字段数据更新到 data_log_2 v_id字段中。
因为两表中的user、phone_number数据会有重复,目前只能通过时间字段来做区分,但两表插入时间不一致,差值也不是完全确定在某范围内。
弄了半天没找到完美的语句写法可以找到两表中最匹配的数据,麻烦各位SQL高手相帮了!
结构如下:
CREATE TABLE `data_log_1` (
`r_id` int(10) unsigned NOT NULL auto_increment,
`call_time` datetime default NULL,
`user` varchar(20) default NULL,
`v_id` varchar(30) default NULL,
`phone_number` varchar(18) default '',
PRIMARY KEY (`r_id`),
KEY `user` (`user`),
KEY `phone_number` (`phone_number`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
insert into data_log_1(call_time,user,v_id,phone_number)
values ('2012-11-20 08:46:51','1173','1353372410.254453','19056866388'),('2012-11-20 08:47:19','1173','1353372439.254467','19056866388'),
('2012-11-20 08:47:37','1173','1353372457.254475','19056866388'),('2012-11-20 08:49:18','1173','1353372557.254517','19056866388'),
('2012-11-20 08:51:33','1173','1353372692.254549','19056866388'),('2012-11-20 08:53:58','1173','1353372837.254584','19056866388'),
('2012-11-20 08:55:11','1173','1353372910.254617','19056866388'),('2012-11-20 08:56:25','1173','1353372984.254644','19056866388'),
('2012-11-20 08:58:07','1173','1353373087.254693','19056866388'),('2012-11-20 08:58:58','1173','1353373138.254726','19056866388');
CREATE TABLE `data_log_2` (
`r_id` int(10) unsigned NOT NULL auto_increment,
`start_time` datetime default NULL,
`user` varchar(20) default NULL,
`v_id` varchar(30) default NULL,
`phone_number` varchar(18) default '',
PRIMARY KEY (`r_id`),
KEY `user` (`user`),
KEY `phone_number` (`phone_number`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
insert into data_log_2(start_time,user,v_id,phone_number)
values ('2012-11-20 08:46:51','1173','','19056866388'),('2012-11-20 08:47:19','1173','','19056866388'),
('2012-11-20 08:47:37','1173','','19056866388'),('2012-11-20 08:49:18','1173','','19056866388'),
('2012-11-20 08:51:33','1173','','19056866388'),('2012-11-20 08:53:57','1173','','19056866388'),
('2012-11-20 08:51:27','1173','','19056866388'),('2012-11-20 08:56:25','1173','','19056866388'),
('2012-11-20 08:58:07','1173','','19056866388'),('2012-11-20 08:58:58','1173','','19056866388');
/*语句如下*/
mysql> select a.phone_number,b.phone_number,a.v_id,b.v_id,a.call_time,b.start_ti
me,TIMESTAMPDIFF(second,a.call_time,b.start_time) as time_d from data_log_1 a le
ft join data_log_2 b on a.phone_number=b.phone_number and a.user=b.user and TIME
STAMPDIFF(second,a.call_time,b.start_time) BETWEEN -10 and 20;
+--------------+--------------+-------------------+------+---------------------+
---------------------+--------+
| phone_number | phone_number | v_id | v_id | call_time |
start_time | time_d |
+--------------+--------------+-------------------+------+---------------------+
---------------------+--------+
| 19056866388 | 19056866388 | 1353372410.254453 | | 2012-11-20 08:46:51 |
2012-11-20 08:46:51 | 0 |
| 19056866388 | 19056866388 | 1353372439.254467 | | 2012-11-20 08:47:19 |
2012-11-20 08:47:19 | 0 |
| 19056866388 | 19056866388 | 1353372439.254467 | | 2012-11-20 08:47:19 |
2012-11-20 08:47:37 | 18 |
| 19056866388 | 19056866388 | 1353372457.254475 | | 2012-11-20 08:47:37 |
2012-11-20 08:47:37 | 0 |
| 19056866388 | 19056866388 | 1353372557.254517 | | 2012-11-20 08:49:18 |
2012-11-20 08:49:18 | 0 |
| 19056866388 | 19056866388 | 1353372692.254549 | | 2012-11-20 08:51:33 |
2012-11-20 08:51:33 | 0 |
| 19056866388 | 19056866388 | 1353372692.254549 | | 2012-11-20 08:51:33 |
2012-11-20 08:51:27 | -6 |
| 19056866388 | 19056866388 | 1353372837.254584 | | 2012-11-20 08:53:58 |
2012-11-20 08:53:57 | -1 |
| 19056866388 | NULL | 1353372910.254617 | NULL | 2012-11-20 08:55:11 |
NULL | NULL |
| 19056866388 | 19056866388 | 1353372984.254644 | | 2012-11-20 08:56:25 |
2012-11-20 08:56:25 | 0 |
| 19056866388 | 19056866388 | 1353373087.254693 | | 2012-11-20 08:58:07 |
2012-11-20 08:58:07 | 0 |
| 19056866388 | 19056866388 | 1353373138.254726 | | 2012-11-20 08:58:58 |
2012-11-20 08:58:58 | 0 |
+--------------+--------------+-------------------+------+---------------------+
---------------------+--------+
12 rows in set (0.00 sec)
/*
正常情况下应该是匹配10条结果,但现在的情况是 第2条数据 插入时间为 2012-11-20 08:47:19, 第3条 为 2012-11-20 08:47:37
相隔的很近,按上面的语句没法将这两条区分出来。第6条与第7条情况一样,都是匹配到了两条。我想找的是时间间隔最短最相近的那条。
现在如果想用上面的条件将 data_log_1 中v_id 字段 update到 data_log_2 中的v_id,可能会出现不是时间间隔最近的V_id值被更新了过去,请问该如何写这条更新语句呢?
*/
/*原语句如下*/
update data_log_2 a,data_log_1 b set a.v_id=b.v_id where a.phone_number=b.phone_number and a.user=b.user and TIMESTAMPDIFF(second,b.call_time,a.start_time) BETWEEN -10 and 20