Hadoop

The Hadoop software architecture deployed on OSIRIM is based on the Hortonworks HDP 2.6 distribution. It consists of:

• A cluster of 48 cores spread over 6 compute servers. It is structured as follows:

- A user login node (osirim-hadoop.irit.fr). This is the "client" node to which users connect to launch jobs of all types (MapReduce, HBase, Hive, Spark, Pig, ...) and to access data via the HDFS protocol.

- Compute nodes (co2-hdp26-worker-01, ..., co2-hdp26-worker-06). These 6 nodes are dedicated compute servers, each with 8 cores and 64 GB of RAM. Users cannot log in to them. A process running on a compute node (a map or reduce task, for example) reads data hosted on the storage area, processes it, and writes the result back to that area. The Hadoop cluster manages these processes.

• A storage area with a capacity of about 1 PB. This storage is provided by a storage array composed of 11 nodes (Node 1, ..., Node 11).
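
As a quick sanity check on the figures above, the aggregate compute resources can be derived from the per-node specification. The sketch below is only illustrative: the names `WorkerSpec` and `cluster_totals` are hypothetical, and the total RAM figure is derived here rather than stated in the description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WorkerSpec:
    """Resources of one compute node (hypothetical helper type)."""
    cores: int
    ram_gb: int


# Figures taken from the description: 6 compute nodes, each with 8 cores and 64 GB of RAM.
WORKER_COUNT = 6
WORKER = WorkerSpec(cores=8, ram_gb=64)


def cluster_totals(count: int, spec: WorkerSpec) -> Tuple[int, int]:
    """Return (total cores, total RAM in GB) for `count` identical nodes."""
    return count * spec.cores, count * spec.ram_gb


total_cores, total_ram_gb = cluster_totals(WORKER_COUNT, WORKER)
print(f"{total_cores} cores, {total_ram_gb} GB RAM")  # prints "48 cores, 384 GB RAM"
```

The 48 cores match the cluster size given above. The 384 GB of aggregate RAM is an upper bound on what jobs can use, since the operating system and the Hadoop services themselves reserve part of each node's memory.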