Sqoop import: composite primary key and textual primary key

Specify the split column manually. The split column does not have to be the primary key: you can have a composite PK and a separate integer split column. Any integer column works, and so does a simple function of a column (such as a substring or a cast, but not an aggregate or analytic function). Ideally the split column is an evenly distributed integer. For example if your …
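As a minimal sketch of the advice above, the snippet below assembles a Sqoop import command that names the split column explicitly through `--split-by`. The connection string, table name, target directory, and the integer column `order_seq` are hypothetical placeholders, not values from the original answer.

```python
import shlex


def build_sqoop_import(connect, table, split_by, target_dir, num_mappers=4):
    """Return a Sqoop import command as an argument list.

    split_by names the column Sqoop uses to partition the work across
    mappers; it does not have to be the table's primary key.
    """
    return [
        "sqoop", "import",
        "--connect", connect,
        "--table", table,
        "--split-by", split_by,
        "--num-mappers", str(num_mappers),
        "--target-dir", target_dir,
    ]


# Hypothetical table with a composite PK and a separate integer column.
cmd = build_sqoop_import(
    connect="jdbc:mysql://dbhost/sales",
    table="orders",
    split_by="order_seq",
    target_dir="/data/orders",
)

# Print a shell-safe command line that could be pasted into a terminal.
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` unchanged if you want to launch the import from Python.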

Spark read file from S3 using sc.textFile("s3n://…)

Confirmed that this is related to the Spark build against Hadoop 2.6.0. I installed Spark 1.4.0 "Pre-built for Hadoop 2.4 and later" (instead of the Hadoop 2.6 build), and the code now works. sc.textFile("s3n://bucketname/Filename") now raises a different error: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password …
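One common way to address the credentials error is to set the s3n credential properties on the Hadoop configuration before reading. The sketch below assumes PySpark and a live `SparkContext` named `sc`. The helper names and the placeholder keys are illustrative only, and this may not be the fix the original answer goes on to describe.

```python
def s3n_credentials_conf(access_key, secret_key):
    """Return the Hadoop properties that carry s3n credentials."""
    return {
        "fs.s3n.awsAccessKeyId": access_key,
        "fs.s3n.awsSecretAccessKey": secret_key,
    }


def apply_to_spark_context(sc, conf):
    """Copy each property into the SparkContext's Hadoop configuration."""
    hadoop_conf = sc._jsc.hadoopConfiguration()
    for key, value in conf.items():
        hadoop_conf.set(key, value)


# Placeholder credentials; substitute real values from a secure source.
conf = s3n_credentials_conf("YOUR_ACCESS_KEY_ID", "YOUR_SECRET_ACCESS_KEY")

# With a live SparkContext `sc` (not created here):
# apply_to_spark_context(sc, conf)
# rdd = sc.textFile("s3n://bucketname/Filename")
```

Keeping the properties in a plain dictionary makes it easy to inspect them without starting Spark, and avoids embedding the keys in the `s3n://` URL itself.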