I recently installed Hadoop and other frameworks (Hive, TPC-DS, TeraGen, Spark) into a container and configured them to work with S3.

As a result, the same S3 data can be accessed from all of these tools.
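For readers unfamiliar with how Hadoop-based tools are usually pointed at S3, here is a minimal sketch of generating a core-site.xml fragment for Hadoop's S3A connector. This is not necessarily how the container is configured; the endpoint and keys below are placeholder assumptions, and the helper name render_core_site is made up for illustration. Only the fs.s3a.* property names come from Hadoop itself.

```python
import xml.etree.ElementTree as ET


def render_core_site(endpoint, access_key, secret_key):
    """Return a core-site.xml string with basic S3A connector settings.

    Hypothetical helper; all argument values are supplied by the caller.
    """
    props = {
        "fs.s3a.endpoint": endpoint,          # S3 endpoint URL (assumed value)
        "fs.s3a.access.key": access_key,      # placeholder credential
        "fs.s3a.secret.key": secret_key,      # placeholder credential
        "fs.s3a.path.style.access": "true",   # often needed for non-AWS S3 endpoints
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example values only; replace with the real endpoint and credentials.
    print(render_core_site("http://s3.example.local:8000", "ACCESS_KEY", "SECRET_KEY"))
```

Since Hive and Spark typically pick up the Hadoop configuration, a single set of S3A settings like this is usually what lets all of the tools see the same s3a:// paths.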

The container image is hosted on Docker Hub (to download it, all you need to do is run it).

The attached document describes the container's contents and how to use it.

https://docs.google.com/document/d/1v5jiarEK6CEU0cstBjpyQ8T5ucTF7LWSwHe-FctAHbU/edit?usp=sharing

I would appreciate your comments.

Gal.