While working on the “youtube-mp3-archive” project, I stumbled across two issues which are worth documenting for future reference.
“aws s3 ls” shows “???” instead of the UTF-8 key names of the S3 objects
On my machine this happens when I pipe the output of “aws s3 ls” to another program. Here is an example:

$ aws s3 ls --recursive s3://youtube-mp3.famzah/ | tee | grep 4185710
2016-10-30 08:08:49    4185710 mp3/Youtube/??????? - ?? ???? ?????-BF6KuR8vWN0.mp3
There is already a discussion about this at the AWS CLI project. The solution in my case was to set the PYTHONIOENCODING environment variable and force UTF-8:

$ PYTHONIOENCODING=utf8 aws s3 ls --recursive s3://youtube-mp3.famzah/ | tee | grep 4185710
2016-10-30 08:08:49    4185710 mp3/Youtube/Аналгин - Тя беше ангел-BF6KuR8vWN0.mp3
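Under the hood, PYTHONIOENCODING overrides the encoding Python chooses for a piped (non-TTY) stdout. A minimal sketch of the effect in plain Python (the Cyrillic sample comes from the listing above; the "C" locale simulates an ASCII-only environment):

```python
import os
import subprocess
import sys

# Run a child interpreter whose stdout is a pipe (not a TTY) under the
# ASCII-only "C" locale, but force its stdio encoding to UTF-8 via
# PYTHONIOENCODING -- the same override used with "aws s3 ls" above.
env = dict(os.environ, PYTHONIOENCODING="utf8", LC_ALL="C")
child = subprocess.run(
    [sys.executable, "-c", r"print('\u0410\u043d\u0430\u043b\u0433\u0438\u043d')"],
    capture_output=True,
    env=env,
)

# The piped bytes are valid UTF-8, so the Cyrillic name survives intact.
assert child.stdout.decode("utf-8").strip() == "Аналгин"
```

Without the override, older Python versions fall back to the locale's codec for a piped stdout, which is where the "?" replacement characters come from.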
How to convert all stored S3 objects to another Storage Class
As already explained, the Storage Class cannot be set on a per-bucket basis. It must be specified by your client with each upload operation.
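For example, a single upload with a non-default Storage Class looks like this (the file name is made up for illustration; the bucket is the one from the examples above, and the command requires configured AWS credentials):

```shell
# Upload one file directly into the Standard-Infrequent Access class.
# "some-song.mp3" is a hypothetical file name.
aws s3 cp "some-song.mp3" \
    "s3://youtube-mp3.famzah/mp3/Youtube/some-song.mp3" \
    --storage-class STANDARD_IA
```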
The migration procedure is already documented at the AWS CLI project. Here are the commands to check the current Storage Class of all objects in an S3 bucket, and how to convert them to a different Storage Class:
# all our S3 objects are using the "Standard" Storage Class
$ aws s3api list-objects --bucket youtube-mp3.famzah | grep StorageClass | sort | uniq -c
    749         "StorageClass": "STANDARD"

# convert without re-uploading the objects from your computer
$ aws s3 cp --recursive --storage-class STANDARD_IA s3://youtube-mp3.famzah/ s3://youtube-mp3.famzah/

# all our S3 objects are now using the "Standard-Infrequent" Storage Class
$ aws s3api list-objects --bucket youtube-mp3.famzah | grep StorageClass | sort | uniq -c
    749         "StorageClass": "STANDARD_IA"
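To see what the "grep | sort | uniq -c" counting step does on its own, here it is run against a small simulated fragment of "aws s3api list-objects" output (the keys are made up; real input comes from the command above):

```shell
# Simulated fragment of "aws s3api list-objects" JSON output
# (hypothetical keys); the pipeline tallies objects per Storage Class.
cat <<'EOF' | grep StorageClass | sort | uniq -c
        "Key": "mp3/Youtube/a.mp3",
        "StorageClass": "STANDARD",
        "Key": "mp3/Youtube/b.mp3",
        "StorageClass": "STANDARD",
        "Key": "mp3/Youtube/c.mp3",
        "StorageClass": "STANDARD_IA",
EOF
# prints two lines: a count of 2 for STANDARD, and 1 for STANDARD_IA
```

Sorting first matters: "uniq -c" only collapses adjacent duplicate lines, so identical "StorageClass" lines must be grouped together before counting.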
The reason to use a different Storage Class is pricing: Standard-Infrequent Access costs less per GB-month for storage than Standard, in exchange for per-GB retrieval fees, which suits an archive that is written once and rarely read.
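A back-of-the-envelope sketch of the saving for this bucket, using the object counts and size from the listings above. The per-GB rates are assumed example values for illustration only, not current AWS prices:

```python
# Storage-cost comparison. The rates below are ASSUMED example values,
# not current AWS prices -- check the S3 pricing page for real numbers.
STANDARD_RATE = 0.023       # USD per GB-month (assumed)
STANDARD_IA_RATE = 0.0125   # USD per GB-month (assumed)

# Bucket size from the listings above: 749 objects of roughly 4 MB each.
size_gb = 749 * 4_185_710 / 10**9

monthly_standard = size_gb * STANDARD_RATE
monthly_ia = size_gb * STANDARD_IA_RATE

# Note: Standard-IA also bills per-GB retrieval fees and a minimum
# storage duration, so it only pays off for rarely accessed data.
print(f"{size_gb:.2f} GB: {monthly_standard:.3f} USD (Standard) "
      f"vs {monthly_ia:.3f} USD (Standard-IA) per month")
```

At this small scale the absolute saving is tiny, but the same ratio applies to archives of any size.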