xarray now supports multi-dimensional concatenation directly through open_mfdataset.
The documentation on combining data along multiple dimensions is here, but as your question is very similar to this one, I'm going to copy the key parts of my answer here:
You have a 2D concatenation problem: you need to arrange the datasets such that when joined up along x and y, they make a larger dataset which also has dimensions x and y.
As long as len(x) is the same in every file, and len(y) is the same in every file, you should in theory be able to do this in one of two different ways.
1) Using combine='nested'
You can manually specify the order that you need them joined up in. xarray allows you to do this by passing the datasets as a grid, specified as a nested list. In your case, if we had 4 files (named [upper_left, upper_right, lower_left, lower_right]), we would combine them like so:
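from xarray import open_mfdataset

# upper_left etc. stand for the paths of the four files
grid = [[upper_left, upper_right],
        [lower_left, lower_right]]
# The outer list is joined along 'x', the inner lists along 'y'
ds = open_mfdataset(grid, concat_dim=['x', 'y'], combine='nested')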
We had to tell open_mfdataset which dimensions of the data the rows and columns of the grid corresponded to, so it would know which dimensions to concatenate the data along. That's why we needed to pass concat_dim=['x', 'y'].
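To see the nested-grid ordering without touching any files, here is a minimal sketch that uses xr.combine_nested on in-memory datasets. The make_tile helper, the variable name "data", and the tile sizes are made up for illustration. As far as I know open_mfdataset with combine='nested' arranges the opened files by the same rules, but treat this as a sketch rather than a description of its internals:

```python
import numpy as np
import xarray as xr


def make_tile(value, nx=2, ny=3):
    """Hypothetical helper: a small Dataset with dims (x, y) and no coordinates."""
    return xr.Dataset({"data": (("x", "y"), np.full((nx, ny), value))})


# Four made-up tiles standing in for the four files
upper_left, upper_right = make_tile(0), make_tile(1)
lower_left, lower_right = make_tile(2), make_tile(3)

grid = [[upper_left, upper_right],
        [lower_left, lower_right]]

# Outer list -> first entry of concat_dim ('x'),
# inner lists -> second entry ('y')
combined = xr.combine_nested(grid, concat_dim=["x", "y"])

print(combined.sizes)
```

Each tile here is 2 x 3, so the combined dataset has len(x) == 4 and len(y) == 6. If your tiles end up transposed relative to what you expected, swapping the order of the names in concat_dim (or re-nesting the grid) is the thing to try.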
2) Using combine='by_coords'
But your data has coordinates in it already - can't xarray just use those to arrange the datasets in the right order? That is what the combine='by_coords' option is for, but unfortunately it requires 1-dimensional coordinates (also known as dimension coordinates) to arrange the data. If your files don't have any of those, the printout will say Dimensions without coordinates: x, y.
If you can add 1-dimensional coordinates to your files first, then you could use combine='by_coords' and just pass a list of all the files in any order, i.e.
ds = open_mfdataset([file1, file2, ...], combine='by_coords')
But otherwise you'll have to use combine='nested'.
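As a rough illustration of the by_coords route, the sketch below attaches 1-dimensional x and y coordinates to in-memory tiles with assign_coords and then lets xr.combine_by_coords work out the arrangement. The make_tile helper, the integer offsets, and the variable name "data" are assumptions for the example. With real files you would need to know (or compute) the actual coordinate values for each tile:

```python
import numpy as np
import xarray as xr


def make_tile(value, x0, y0, nx=2, ny=3):
    """Hypothetical helper: a tile whose x/y coordinates start at (x0, y0)."""
    tile = xr.Dataset({"data": (("x", "y"), np.full((nx, ny), value))})
    # Attach 1-dimensional (dimension) coordinates so by_coords can order the tiles
    return tile.assign_coords(
        x=np.arange(x0, x0 + nx),
        y=np.arange(y0, y0 + ny),
    )


# Deliberately shuffled: the list order should not matter
tiles = [
    make_tile(3, x0=2, y0=3),
    make_tile(0, x0=0, y0=0),
    make_tile(2, x0=2, y0=0),
    make_tile(1, x0=0, y0=3),
]

combined = xr.combine_by_coords(tiles)

print(combined.sizes)
```

The tiles are placed according to their coordinate values, not their position in the list, which is why the order you pass them in stops mattering.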