distcp from Hadoop cluster to AWS S3 bucket
Background
Access key IDs beginning with AKIA are long-term credentials for an IAM user or the AWS account root user. Access key IDs beginning with ASIA are temporary credentials that are created using AWS STS operations.
An IAM role (which is what I use in my case) has three values:
1. Access key (ASIAxxxx format)
2. Secret key
3. Security token
Whereas an IAM user has two values:
1. Access key
2. Secret key
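
The prefix rule above can be checked mechanically before choosing which credentials to pass to Hadoop. The following is a minimal sketch, not part of the original setup: the function names key_type and s3a_provider are hypothetical helpers, and the mapping of AKIA keys to SimpleAWSCredentialsProvider is an assumption based on the S3A connector's provider for plain access key and secret key pairs.

```shell
# Classify an access key ID by its prefix (hypothetical helper).
key_type() {
  case "$1" in
    AKIA*) echo "long-term" ;;   # IAM user or root user key
    ASIA*) echo "temporary" ;;   # STS-issued key, needs a session token
    *)     echo "unknown" ;;
  esac
}

# Suggest an S3A credentials provider class for a given access key ID
# (hypothetical helper; the AKIA mapping is an assumption).
s3a_provider() {
  case "$(key_type "$1")" in
    temporary) echo "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider" ;;
    long-term) echo "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider" ;;
    *)         return 1 ;;       # unrecognized prefix: print nothing
  esac
}
```

For example, key_type ASIAXXXXXXXXX prints "temporary", which signals that a security token must be supplied along with the access key and secret key.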
Hadoop commands
# hadoop commands with temporary credentials (ASIAxxxx format)
$ hadoop fs \
    -Dfs.s3a.access.key=ASIAXXXXXXXXX \
    -Dfs.s3a.secret.key="xxxxxxxxxxxxxxxxx" \
    -Dfs.s3a.session.token="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
    -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
    -ls s3a://my_bucket_name/hadoop_files

$ hadoop distcp \
    -Dfs.s3a.access.key=ASIAXXXXXXXXX \
    -Dfs.s3a.secret.key="xxxxxxxxxxxxxxxxx" \
    -Dfs.s3a.session.token="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
    -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
    hdfs:///folder/file s3a://my_bucket_name/hadoop_files/
If you have the awscli installed, you can read the access key, secret key and security token from the ~/.aws/credentials file.
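
One way to pull those values out of the credentials file is a small awk-based reader. This is only a sketch under a few assumptions: the file uses the usual INI layout with the keys aws_access_key_id, aws_secret_access_key and aws_session_token, the profile name "default" is a placeholder for whichever profile holds your role credentials, and aws_cred is a hypothetical helper name.

```shell
# Print one value from one profile of an AWS credentials file.
# Usage: aws_cred KEY [PROFILE] [FILE]   (hypothetical helper)
aws_cred() {
  awk -v key="$1" -v section="[${2:-default}]" '
    # Section header such as [default]: note whether it is the wanted profile.
    /^[ \t]*\[.*\][ \t]*$/ {
      line = $0
      gsub(/[ \t]/, "", line)
      in_section = (line == section)
      next
    }
    # Inside the wanted profile: split on the first "=" only,
    # because tokens may themselves contain "=" characters.
    in_section {
      pos = index($0, "=")
      if (pos == 0) next
      name = substr($0, 1, pos - 1)
      value = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (name == key) { print value; exit }
    }
  ' "${3:-$HOME/.aws/credentials}"
}

# Example use (commented out so that sourcing this file runs nothing):
#   ACCESS_KEY=$(aws_cred aws_access_key_id default)
#   SECRET_KEY=$(aws_cred aws_secret_access_key default)
#   SESSION_TOKEN=$(aws_cred aws_session_token default)
#   hadoop fs \
#     -Dfs.s3a.access.key="$ACCESS_KEY" \
#     -Dfs.s3a.secret.key="$SECRET_KEY" \
#     -Dfs.s3a.session.token="$SESSION_TOKEN" \
#     -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
#     -ls s3a://my_bucket_name/hadoop_files
```

Keep in mind that temporary credentials expire, so the values have to be read again after they are refreshed.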