Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
3.4k views
in Technique[技术] by (71.8m points)

mysql - String Concat over partition in SQL?

I want to concatenate the surrounding rows(in the following examples only the surrounding 2 rows) after ranking to a new column(group by seems to not work), and here is the data I have:

Schema (MySQL v8.0)

CREATE TABLE log_table (
  `user_id` VARCHAR(5),
  `date_time` DATETIME,
  `event_name` VARCHAR(10),
  `trivial` int
);

INSERT INTO log_table
  (`user_id`, `date_time`, `event_name`, `trivial`)
VALUES
  ('001', '2020-12-10 10:00:02', 'c', 3),
  ('001', '2020-12-10 10:00:01', 'b', 9),
  ('001', '2020-12-10 10:00:40', 'e', 2),
  ('001', '2020-12-10 10:00:20', 'd', 6),
  ('001', '2020-12-10 10:00:00', 'a', 1),
  ('002', '2020-12-10 10:00:10', 'C', 9),
  ('002', '2020-12-10 10:00:50', 'D', 0),
  ('002', '2020-12-10 10:00:02', 'A', 2),
  ('002', '2020-12-10 10:00:09', 'B', 4);

To illustrate what I want to do. I can do summing over numerical values using the sum clause as follows:
Query #1

SELECT
    *, SUM(trivial) over(partition by user_id order by user_id, date_time rows between 2 preceding and 2 following) AS trivial_new
FROM
    log_table;
user_id date_time event_name trivial trivial_new
001 2020-12-10 10:00:00 a 1 13
001 2020-12-10 10:00:01 b 9 19
001 2020-12-10 10:00:02 c 3 21
001 2020-12-10 10:00:20 d 6 20
001 2020-12-10 10:00:40 e 2 11
002 2020-12-10 10:00:02 A 2 15
002 2020-12-10 10:00:09 B 4 15
002 2020-12-10 10:00:10 C 9 15
002 2020-12-10 10:00:50 D 0 13

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can solve this problem by using CTEs: first computing row numbers for each user_id, ordered by date_time; then computing the minimum and maximum rows which should be included in each output string based on the number of rows before and after that are required, and then finally (effectively) JOINing the table to itself based on the row number being within that minimum/maximum row number range. Then you can just GROUP_CONCAT the event_name field from the JOINed table:

SET @before := 2;
SET @after := 2;

WITH rns AS (
  SELECT *, 
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date_time) AS rn,
         COUNT(*) OVER (PARTITION BY user_id) AS numrows
  FROM log_table
),
minmax AS (
  SELECT user_id, 
         GREATEST(CAST(rn AS SIGNED) - @before, 1) AS first,
         LEAST(rn + @after, numrows) AS last,
         event_name
  FROM rns
)
SELECT rns.user_id, rns.date_time, rns.event_name,
       GROUP_CONCAT(mm.event_name SEPARATOR '') AS event_name_new
FROM rns
JOIN minmax mm ON mm.user_id = rns.user_id
              AND rns.rn BETWEEN mm.first AND mm.last
GROUP BY rns.user_id, rns.date_time, rns.event_name
ORDER BY rns.user_id, rn

Output (for 2 before and 2 after):

user_id     date_time               event_name  event_name_new
001         2020-12-10 10:00:00     a           abc
001         2020-12-10 10:00:01     b           abcd
001         2020-12-10 10:00:02     c           abcde
001         2020-12-10 10:00:20     d           bcde
001         2020-12-10 10:00:40     e           cde
002         2020-12-10 10:00:02     A           ABC
002         2020-12-10 10:00:09     B           ABCD
002         2020-12-10 10:00:10     C           ABCD
002         2020-12-10 10:00:50     D           BCD

Demo on db-fiddle


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...