
google cloud storage - When using the BigQuery transfer service with a CSV, is it possible to transfer only certain columns rather than all of them?

I am setting up a BigQuery transfer service to transfer a CSV stored in a GCS bucket into BigQuery.

However, I don't need all the columns in the CSV file. Is there a way of limiting the columns I transfer without having to manually remove the columns before the transfer?

Or, if I limit the columns in my BQ table to the ones I need, will BQ just ignore the other columns in the CSV file?

I have read the relevant page in the documentation but there is no mention of limiting columns.



1 Answer


You can accomplish this by manually specifying the target table schema with only the columns you need, and then setting the ignore_unknown_values option to true when you configure the transfer.

Let's say I have a CSV on Google Cloud Storage with the following data:

"First"|"Second"|"Ignored"
"Third"|"Fourth"|"Ignored"

Then I have a table named test with the following schema:

first_col   STRING  NULLABLE    
second_col  STRING  NULLABLE    
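
If you prefer to set things up programmatically, a minimal sketch of creating such a destination table with the Python BigQuery client could look like the following (the project and dataset names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

# The destination table contains only the columns we want to keep.
# "my-project" and "my_dataset" are placeholder names.
schema = [
    bigquery.SchemaField("first_col", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("second_col", "STRING", mode="NULLABLE"),
]
table = bigquery.Table("my-project.my_dataset.test", schema=schema)
client.create_table(table)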

After configuring the transfer service in the web UI and checking the "Ignore unknown values" checkbox, I get the following data in the table:

first_col   second_col
First       Second
Third       Fourth
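
The same transfer can also be created without the web UI. Below is a rough sketch using the Python Data Transfer Service client; the bucket, project, and dataset names are placeholders, and the exact parameter names should be double-checked against the Cloud Storage transfer documentation:

from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",  # placeholder dataset
    display_name="CSV transfer, extra columns ignored",
    data_source_id="google_cloud_storage",
    params={
        "data_path_template": "gs://my-bucket/data/*.csv",  # placeholder bucket/path
        "destination_table_name_template": "test",
        "file_format": "CSV",
        "field_delimiter": "|",
        # Equivalent of the "Ignore unknown values" checkbox in the UI:
        "ignore_unknown_values": True,
    },
    schedule="every 24 hours",
)

transfer_config = client.create_transfer_config(
    parent=client.common_project_path("my-project"),  # placeholder project
    transfer_config=transfer_config,
)
print("Created transfer config:", transfer_config.name)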
