In Scala, you can use Amazon's official SDK for Java, which provides the AmazonS3::listObjects method:
import scala.collection.JavaConverters._
import com.amazonaws.services.s3.model.ObjectListing

// Returns every key in the bucket, following pagination until the last page.
def keys(bucket: String): List[String] = nextBatch(s3Client.listObjects(bucket))

// Accumulates the keys of each page; later pages are prepended to earlier ones.
private def nextBatch(listing: ObjectListing, keys: List[String] = Nil): List[String] = {
  val pageKeys = listing.getObjectSummaries.asScala.map(_.getKey).toList
  if (listing.isTruncated)
    nextBatch(s3Client.listNextBatchOfObjects(listing), pageKeys ::: keys)
  else
    pageKeys ::: keys
}
Note the recursion on ObjectListing objects.
The keys in a bucket are listed in batches (the API is paginated, as described in the AWS documentation), so s3Client.listObjects(bucket).getObjectSummaries.asScala.map(_.getKey) alone would return at most the first 1000 keys.
Hence the recursive call: it requests the next page of keys for as long as ObjectListing::isTruncated is true, until all keys in the bucket have been collected.
Beware of memory issues if your bucket is huge, though: all keys are accumulated in a single in-memory List.
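If memory is a concern, one option is to consume the keys lazily, page by page, instead of building the whole List. The following is only a minimal sketch under that assumption: paginate is a hypothetical helper (it is not part of the AWS SDK), and the commented usage with s3Client is untested.

```scala
object Pagination {
  // Hypothetical helper (not part of the AWS SDK): lazily walks a paginated result.
  // first: the first page; next: the following page, or None after the last one;
  // items: extracts the elements of a single page.
  def paginate[P, A](first: P)(next: P => Option[P])(items: P => Iterable[A]): Iterator[A] =
    Iterator
      .iterate(Option(first))(_.flatMap(next)) // later pages are fetched only when reached
      .takeWhile(_.isDefined)                  // stop once there is no further page
      .flatMap(_.toList)
      .flatMap(items)
}

// Possible use with the s3Client from this answer (untested sketch):
// def lazyKeys(bucket: String): Iterator[String] =
//   Pagination.paginate(s3Client.listObjects(bucket))(l =>
//     if (l.isTruncated) Some(s3Client.listNextBatchOfObjects(l)) else None
//   )(_.getObjectSummaries.asScala.map(_.getKey))
```

With an Iterator, only the current page has to be held in memory, as long as the caller processes keys as they arrive rather than calling toList on the result.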
s3Client can be built as follows:
import com.amazonaws.services.s3.{AmazonS3, AmazonS3ClientBuilder}
import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}
// BasicAWSCredentials takes the access key ID first, then the secret access key
val credentials = new BasicAWSCredentials(awsKey, awsAccessKey)
val s3Client: AmazonS3 = AmazonS3ClientBuilder
  .standard()
  .withCredentials(new AWSStaticCredentialsProvider(credentials))
  .build()
with these dependencies in build.sbt (replace 1.11.391 with the latest version):
libraryDependencies ++= Seq(
"com.amazonaws" % "aws-java-sdk-bom" % "1.11.391",
"com.amazonaws" % "aws-java-sdk-s3" % "1.11.391"
)