I recently installed Hadoop and related tools (Hive, TPC-DS, TeraGen, and Spark) in a container and configured them to work with S3.

As a result, the same S3 data can be accessed through all of these tools.
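
For readers wondering what "configured to work with S3" might involve, here is a minimal sketch, not the container's actual setup. It assumes the Hadoop S3A connector is used and generates a core-site.xml fragment with the standard S3A properties. The helper name build_core_site, the endpoint, and the keys are placeholders made up for illustration, and path-style access is an assumption that is typically only needed for non-AWS S3 endpoints.

```python
import xml.etree.ElementTree as ET


def build_core_site(endpoint, access_key, secret_key):
    """Return a core-site.xml document (as a string) with S3A settings.

    Hypothetical helper for illustration; the property names are the
    standard Hadoop S3A ones, the values are supplied by the caller.
    """
    props = {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # Assumption: path-style access, common for non-AWS S3 services.
        "fs.s3a.path.style.access": "true",
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder endpoint and credentials, not real values.
    print(build_core_site("http://s3.example.com:8080", "ACCESS_KEY", "SECRET_KEY"))
```

Tools that read the Hadoop configuration can then generally address the same bucket through s3a:// paths, which is what makes shared access to one dataset possible.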

The container image is hosted on Docker Hub, so running it is all that is needed to download it.

The attached document describes the container's contents and how to use it.

I would appreciate your comments.

Gal.