Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


sqlalchemy - Python - Speeding up imports from CSV files with unknown columns to database

I have CSV files containing user-submitted form data from a few websites. These files can have any number of columns (one per form field), and the values can be anything. A few columns are constant, such as Form ID and Form URL. I need to dynamically create a table for each form/CSV and insert the data into a predefined MySQL database.

I wrote a script some time ago leveraging MySQLdb that does exactly this, but at a rate of roughly 2 rows per second. In total I have about 90K rows of data.

My process was this:

  • Grab part of the site name, the form name and the form ID to dynamically build a table name
  • Create the table if it does not exist, using string-concatenated SQL. Column names are derived from the CSV fieldnames, stripped of special characters and converted to snake case; all columns are typed VARCHAR
  • Loop through the CSV rows and INSERT each one into the table, using a dictionary and placeholders to prevent SQL injection

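A minimal sketch of the process described above, using `sqlite3` as an in-memory stand-in for MySQLdb (both follow the DB-API 2.0 interface; MySQLdb uses `%s` placeholders instead of `?`). The table name, sanitizing helper and sample CSV here are illustrative, not taken from the original script:

```python
import csv
import io
import re
import sqlite3

def snake(name):
    # Strip special characters and convert a CSV fieldname to snake_case.
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()

# Illustrative stand-in for one uploaded form CSV.
form_csv = io.StringIO(
    "Form ID,Form URL,Your Name\n"
    "42,http://example.com/form,Alice\n"
)
table = "examplesite_contact_42"  # site name + form name + form ID

reader = csv.DictReader(form_csv)
columns = [snake(f) for f in reader.fieldnames]

conn = sqlite3.connect(":memory:")  # stand-in for MySQLdb.connect(...)
col_defs = ", ".join(f"{c} VARCHAR(255)" for c in columns)
conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")

# One INSERT per row, with placeholders for the values to avoid injection.
# (Identifiers can't be parameterized, hence the sanitizing above.)
placeholders = ", ".join("?" for _ in columns)  # "%s" under MySQLdb
sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
for row in reader:
    conn.execute(sql, [row[f] for f in reader.fieldnames])
conn.commit()
```

The per-row `execute()` in the final loop is the likely bottleneck at 90K rows, since each call round-trips to the server individually.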
Upon revision I noticed I didn't make use of bulk_query() or executemany(), which should make a difference. It would likely also be better to do away with the string concatenation and use SQLAlchemy instead. However, I understand that would require predefined classes to build the models from. Is that something that can be defined on the fly?

question from:https://stackoverflow.com/questions/65901327/python-speeding-up-imports-from-csv-files-with-unknown-columns-to-database


1 Answer

Waiting for answers
