I am trying to upload a dataset to the Microsoft Cognitive Services Speech portal for custom models.
I have been doing this for about a year without issue, but now the upload ends in "Failed" with the detail "Failed to upload data. Please check your data format and try to upload again." ... very useful.
Does anyone know what could be causing this, apart from the points below, which I have already checked?
Filesize is 1.3GB (zipped) / 1.8GB (unzipped) which is below the 2GB limit for "Max acoustic dataset file size for Data Import" as specified in https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-services-quotas-and-limits#model-customization
The Trans.txt file is a properly formatted 1.3MB UTF-8 text file with a BOM, containing tab-separated filename / text values as specified in https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-and-train
All entries in the Trans.txt file are present in the directory
All files in the directory have an associated entry in the Trans.txt file
All files are WAV files in the specified format.
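
For anyone who wants to repeat the checks above locally before uploading, here is a minimal sketch. It is not an official validator. The function name `check_dataset` is made up for illustration, the 2GB limit is the one quoted from the quotas page above, and the WAV parameters (mono, 16-bit PCM, 8 kHz or 16 kHz) are my assumption of what "the specified format" means, so adjust them to whatever the docs currently say. It also assumes a flat zip with Trans.txt and the audio files at the top level.

```python
import io
import os
import wave
import zipfile

BOM = b"\xef\xbb\xbf"


def check_dataset(zip_path, transcript_name="Trans.txt",
                  allowed_rates=(8000, 16000),  # assumed sample rates
                  max_bytes=2 * 1024**3):       # 2GB limit quoted above
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if os.path.getsize(zip_path) > max_bytes:
        problems.append("zip file is larger than the size limit")

    with zipfile.ZipFile(zip_path) as zf:
        infos = [i for i in zf.infolist() if not i.is_dir()]
        if sum(i.file_size for i in infos) > max_bytes:
            problems.append("unzipped content is larger than the size limit")

        names = {i.filename for i in infos}
        if transcript_name not in names:
            return problems + [f"{transcript_name} not found in the archive"]

        raw = zf.read(transcript_name)
        if not raw.startswith(BOM):
            problems.append(f"{transcript_name} has no UTF-8 BOM")
        try:
            text = raw.decode("utf-8-sig")  # accepts input with or without BOM
        except UnicodeDecodeError as exc:
            return problems + [f"{transcript_name} is not valid UTF-8: {exc}"]

        # Each non-empty line must be "filename<TAB>text".
        listed = set()
        for number, line in enumerate(text.splitlines(), 1):
            if not line.strip():
                continue
            parts = line.split("\t", 1)
            if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
                problems.append(f"line {number}: expected 'filename<TAB>text'")
                continue
            listed.add(parts[0].strip())

        # Transcript entries and audio files must match one to one.
        audio = names - {transcript_name}
        for name in sorted(listed - audio):
            problems.append(f"{name}: listed in transcript but not in archive")
        for name in sorted(audio - listed):
            problems.append(f"{name}: in archive but not in transcript")

        # Inspect the header of every audio file that has a transcript entry.
        for name in sorted(audio & listed):
            try:
                with wave.open(io.BytesIO(zf.read(name)), "rb") as wav:
                    if wav.getnchannels() != 1:
                        problems.append(f"{name}: not mono")
                    if wav.getsampwidth() != 2:
                        problems.append(f"{name}: not 16-bit")
                    if wav.getframerate() not in allowed_rates:
                        problems.append(f"{name}: unexpected sample rate")
            except (wave.Error, EOFError) as exc:
                problems.append(f"{name}: not a readable PCM WAV file ({exc})")
    return problems
```

Calling `check_dataset("dataset.zip")` (the path is a placeholder) returns an empty list when none of these problems are found. A clean result only rules out the issues listed above; it cannot detect a failure on the service side.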
All of the above has been working for a year. The only thing that really changes between uploads is the size of the zip file, which is still below the limit.
On the off-chance someone from MS sees this, the dataset ID is: 7a3f240c-5eb7-4942-8e0f-7efa1b808eee
Related feedback post: https://feedback.azure.com/forums/932041-azure-cognitive-services/suggestions/42375118-actionable-error-messaging-in-speech-portal
After contacting MS support, it appears that something related to file size broke server-side, even though we are within the limits. They are working on fixing it.