#########
Hadoop 2
#########

Overview
========

===== ==========================
Port  Description
===== ==========================
50070 hadoop namenode web
54310 hadoop namenode
50010 hadoop datanode
50020 hadoop datanode
50075 hadoop datanode
8030  hadoop resourcemanager
8031  hadoop resourcemanager
8032  hadoop resourcemanager
8033  hadoop resourcemanager
8088  hadoop resourcemanager web
13562 hadoop nodemanager
8040  hadoop nodemanager
34568 hadoop nodemanager
8042  hadoop nodemanager
19888 hadoop job history web
10020 hadoop job history
10033 hadoop job history
===== ==========================

Installation
============

* Edit ``etc/hadoop/core-site.xml`` to configure the tmp dir and the location of the name node

  .. code-block:: xml

     <property>
       <name>hadoop.tmp.dir</name>
       <value>/local/hadoop/tmp</value>
       <description>A base for other temporary directories.</description>
     </property>
     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://127.0.0.1:54310</value>
     </property>
     <property>
       <name>hadoop.security.authorization</name>
       <value>true</value>
     </property>
     <property>
       <name>hadoop.http.staticuser.user</name>
       <value>hdfs</value>
     </property>

* Edit ``etc/hadoop/mapred-site.xml`` to set the location of the job tracker, its working directories and the MapReduce framework

  .. code-block:: xml

     <property>
       <name>mapred.job.tracker</name>
       <value>127.0.0.1:54311</value>
     </property>
     <property>
       <name>mapreduce.jobtracker.staging.root.dir</name>
       <value>/user</value>
     </property>
     <property>
       <name>mapred.tasktracker.map.tasks.maximum</name>
       <value>16</value>
     </property>
     <property>
       <name>mapred.tasktracker.reduce.tasks.maximum</name>
       <value>16</value>
     </property>
     <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
     </property>
     <property>
       <name>mapreduce.jobtracker.expire.trackers.interval</name>
       <value>600000</value>
       <description>Expert: The time-interval, in milliseconds, after which a tasktracker is declared 'lost' if it doesn't send heartbeats.</description>
     </property>
     <property>
       <name>mapreduce.jobtracker.restart.recover</name>
       <value>false</value>
       <description>"true" to enable (job) recovery upon restart, "false" to start afresh.</description>
     </property>
     <property>
       <name>mapreduce.task.io.sort.factor</name>
       <value>10</value>
       <description>The number of streams to merge at once while sorting files. This determines the number of open file handles.</description>
     </property>
     <property>
       <name>mapreduce.task.io.sort.mb</name>
       <value>100</value>
       <description>The total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks.</description>
     </property>
     <property>
       <name>mapreduce.tasktracker.http.threads</name>
       <value>40</value>
       <description>The number of worker threads for the HTTP server. This is used for map output fetching.</description>
     </property>
     <property>
       <name>mapreduce.task.timeout</name>
       <value>600000</value>
       <description>The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string. A value of 0 disables the timeout.</description>
     </property>
     <property>
       <name>mapreduce.task.tmp.dir</name>
       <value>./tmp</value>
       <description>The tmp directory for map and reduce tasks. If the value is an absolute path, it is directly assigned. Otherwise, it is prepended with the task's working directory. The Java tasks are executed with the option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and streaming are set with the environment variable TMPDIR='the absolute path of the tmp dir'.</description>
     </property>
     <property>
       <name>mapreduce.output.fileoutputformat.compress</name>
       <value>false</value>
       <description>Should the job outputs be compressed?</description>
     </property>
     <property>
       <name>mapreduce.shuffle.ssl.enabled</name>
       <value>false</value>
       <description>Whether to use SSL for the Shuffle HTTP endpoints.</description>
     </property>
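* Optionally verify that the values so far are picked up; a minimal check with ``hdfs getconf``, which only reads the XML files, so no daemon has to be running yet (expected values follow the configuration above)

  .. code-block:: bash

     /opt/hadoop/bin/hdfs getconf -confKey fs.defaultFS    # expect hdfs://127.0.0.1:54310
     /opt/hadoop/bin/hdfs getconf -confKey hadoop.tmp.dir  # expect /local/hadoop/tmp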
* Edit ``etc/hadoop/hdfs-site.xml`` to set the working dirs of the name and data nodes and how often a file gets replicated

  .. code-block:: xml

     <property>
       <name>dfs.replication</name>
       <value>3</value>
     </property>
     <property>
       <name>dfs.data.dir</name>
       <value>/data/hadoop/data-node</value>
     </property>
     <property>
       <name>dfs.name.dir</name>
       <value>/data/hadoop/name-node</value>
     </property>
     <property>
       <name>dfs.permissions.supergroup</name>
       <value>hadoop</value>
     </property>
     <property>
       <name>dfs.namenode.accesstime.precision</name>
       <value>3600000</value>
       <description>The access time for an HDFS file is precise up to this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS.</description>
     </property>
     <property>
       <name>dfs.permissions.enabled</name>
       <value>true</value>
       <description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
     </property>
     <property>
       <name>dfs.namenode.fs-limits.min-block-size</name>
       <value>1048576</value>
       <description>Minimum block size in bytes, enforced by the Namenode at create time. This prevents the accidental creation of files with tiny block sizes (and thus many blocks), which can degrade performance.</description>
     </property>
     <property>
       <name>dfs.blocksize</name>
       <value>134217728</value>
       <description>The default block size for new files, in bytes. You can use the following suffixes (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa) to specify the size (such as 128k, 512m, 1g, etc.), or provide the complete size in bytes (such as 134217728 for 128 MB).</description>
     </property>
     <property>
       <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
       <value>1048576</value>
       <description>Maximum number of blocks per file, enforced by the Namenode on write. This prevents the creation of extremely large files which can degrade performance.</description>
     </property>
     <property>
       <name>dfs.heartbeat.interval</name>
       <value>3</value>
       <description>Determines datanode heartbeat interval in seconds.</description>
     </property>
     <property>
       <name>dfs.namenode.handler.count</name>
       <value>10</value>
       <description>The number of server threads for the namenode.</description>
     </property>
     <property>
       <name>dfs.namenode.name.dir.restore</name>
       <value>false</value>
       <description>Set to true to enable the NameNode to attempt recovering a previously failed dfs.namenode.name.dir. When enabled, a recovery of any failed directory is attempted during checkpoint.</description>
     </property>
     <property>
       <name>dfs.image.compress</name>
       <value>false</value>
       <description>Should the dfs image be compressed?</description>
     </property>
     <property>
       <name>dfs.image.transfer.bandwidthPerSec</name>
       <value>0</value>
       <description>Maximum bandwidth used for image transfer in bytes per second. This can help keep normal namenode operations responsive during checkpointing. The maximum bandwidth and the timeout in dfs.image.transfer.timeout should be set such that normal image transfers can complete successfully. A default value of 0 indicates that throttling is disabled.</description>
     </property>
     <property>
       <name>dfs.datanode.max.transfer.threads</name>
       <value>4096</value>
       <description>Specifies the maximum number of threads to use for transferring data in and out of the DN.</description>
     </property>
     <property>
       <name>dfs.ha.automatic-failover.enabled</name>
       <value>false</value>
       <description>Whether automatic failover is enabled. See the HDFS High Availability documentation for details on automatic HA configuration.</description>
     </property>
     <property>
       <name>dfs.webhdfs.enabled</name>
       <value>false</value>
       <description>Enable WebHDFS (REST API) in Namenodes and Datanodes.</description>
     </property>
     <property>
       <name>dfs.https.enable</name>
       <value>false</value>
       <description>Decide if HTTPS (SSL) is supported on HDFS.</description>
     </property>

* Edit ``etc/hadoop/yarn-site.xml``

  .. code-block:: xml

     <property>
       <name>yarn.resourcemanager.resource-tracker.address</name>
       <value>[% HADOOP_MASTER %]:8031</value>
       <description>Host is the hostname of the resource manager and port is the port on which the NodeManagers contact the Resource Manager.</description>
     </property>
     <property>
       <name>yarn.resourcemanager.scheduler.address</name>
       <value>127.0.0.1:8030</value>
       <description>Host is the hostname of the resourcemanager and port is the port on which the Applications in the cluster talk to the Resource Manager.</description>
     </property>
     <property>
       <name>yarn.resourcemanager.scheduler.class</name>
       <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
       <description>In case you do not want to use the default scheduler.</description>
     </property>
     <property>
       <name>yarn.nodemanager.local-dirs</name>
       <value>/data/hadoop/nm</value>
       <description>The local directories used by the nodemanager.</description>
     </property>
     <property>
       <name>yarn.nodemanager.address</name>
       <value>127.0.0.1:8040</value>
       <description>The nodemanagers bind to this port.</description>
     </property>
     <property>
       <name>yarn.nodemanager.resource.memory-mb</name>
       <value>10240</value>
       <description>The amount of memory on the NodeManager in MB.</description>
     </property>
     <property>
       <name>yarn.nodemanager.remote-app-log-dir</name>
       <value>/app-logs</value>
       <description>Directory on HDFS where the application logs are moved to.</description>
     </property>
     <property>
       <name>yarn.nodemanager.log-dirs</name>
       <value></value>
       <description>The directories used by Nodemanagers as log directories.</description>
     </property>
     <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
       <description>Shuffle service that needs to be set for Map Reduce to run.</description>
     </property>

* Create a hadoop user with an SSH key

  .. code-block:: bash

     useradd -d /opt/hadoop hadoop
     chown -R hadoop /opt/hadoop
     su - hadoop
     ssh-keygen
     cat .ssh/id_rsa.pub > .ssh/authorized_keys
     chmod 400 .ssh/authorized_keys
     ssh localhost

* Format the HDFS

  .. code-block:: bash

     su - hadoop -c '/opt/hadoop/bin/hdfs namenode -format -force'
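* Optionally confirm that the format created the on-disk metadata; a quick sanity check, assuming ``dfs.name.dir`` is ``/data/hadoop/name-node`` as configured above

  .. code-block:: bash

     # A freshly formatted namenode directory contains current/ with a VERSION
     # file holding the namespaceID and clusterID of the new filesystem
     ls /data/hadoop/name-node/current/
     cat /data/hadoop/name-node/current/VERSION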
* Start the services

  .. code-block:: bash

     su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh start namenode'
     su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh start datanode'
     su - hadoop -c '/opt/hadoop/sbin/yarn-daemon.sh start resourcemanager'
     su - hadoop -c '/opt/hadoop/sbin/yarn-daemon.sh start nodemanager'

Check status
============

* HDFS

  .. code-block:: bash

     /opt/hadoop/bin/hdfs dfsadmin -report

* YARN

  .. code-block:: bash

     /opt/hadoop/bin/yarn node -list

* Test Namenode

  .. code-block:: bash

     su - hadoop -c '/opt/hadoop/bin/hadoop fs -mkdir /user'
     su - hadoop -c '/opt/hadoop/bin/hadoop fs -mkdir /user/hadoop'

* Test Datanode

  .. code-block:: bash

     su - hadoop -c '/opt/hadoop/bin/hadoop fs -put /opt/hadoop/etc/hadoop/hadoop-env.sh /user/hadoop/hadoop-env'

* Test YARN

  .. code-block:: bash

     su - hadoop -c "/opt/hadoop/bin/hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10"

Configure High Availability
===========================

* Edit ``core-site.xml``

  .. code-block:: xml

     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://mycluster</value>
     </property>
     <property>
       <name>ha.zookeeper.quorum</name>
       <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
     </property>

* Edit ``hdfs-site.xml``

  .. code-block:: xml

     <property>
       <name>dfs.nameservices</name>
       <value>mycluster</value>
     </property>
     <property>
       <name>dfs.ha.namenodes.mycluster</name>
       <value>nn1,nn2</value>
     </property>
     <property>
       <name>dfs.namenode.rpc-address.mycluster.nn1</name>
       <value>hadoop_master:8020</value>
     </property>
     <property>
       <name>dfs.namenode.rpc-address.mycluster.nn2</name>
       <value>hadoop_master2:8020</value>
     </property>
     <property>
       <name>dfs.namenode.http-address.mycluster.nn1</name>
       <value>hadoop_master:50070</value>
     </property>
     <property>
       <name>dfs.namenode.http-address.mycluster.nn2</name>
       <value>hadoop_master2:50070</value>
     </property>
     <property>
       <name>dfs.namenode.shared.edits.dir</name>
       <value>qjournal://hadoop_master:8485;hadoop_master2:8485/mycluster</value>
     </property>
     <property>
       <name>dfs.client.failover.proxy.provider.mycluster</name>
       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
     </property>
     <property>
       <name>dfs.ha.fencing.methods</name>
       <value>sshfence</value>
     </property>
     <property>
       <name>dfs.ha.fencing.ssh.private-key-files</name>
       <value>/opt/hadoop/.ssh/id_rsa</value>
     </property>
     <property>
       <name>dfs.ha.automatic-failover.enabled</name>
       <value>true</value>
       <description>Whether automatic failover is enabled. See the HDFS High Availability documentation for details on automatic HA configuration.</description>
     </property>
     <property>
       <name>ha.zookeeper.quorum</name>
       <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
     </property>

* Edit ``yarn-site.xml``

  .. code-block:: xml

     <property>
       <name>yarn.resourcemanager.ha.enabled</name>
       <value>true</value>
     </property>
     <property>
       <name>yarn.resourcemanager.cluster-id</name>
       <value>mycluster</value>
     </property>
     <property>
       <name>yarn.resourcemanager.ha.rm-ids</name>
       <value>rm1,rm2</value>
     </property>
     <property>
       <name>yarn.resourcemanager.hostname.rm1</name>
       <value>hadoop_master</value>
     </property>
     <property>
       <name>yarn.resourcemanager.hostname.rm2</name>
       <value>hadoop_master2</value>
     </property>
     <property>
       <name>yarn.resourcemanager.webapp.address.rm1</name>
       <value>hadoop_master:8088</value>
     </property>
     <property>
       <name>yarn.resourcemanager.webapp.address.rm2</name>
       <value>hadoop_master2:8088</value>
     </property>
     <property>
       <name>yarn.resourcemanager.zk-address</name>
       <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
     </property>

* Start the Zookeeper Fencing Service (zkfc) in addition to the normal Zookeeper ensemble

  .. code-block:: bash

     /opt/hadoop/sbin/hadoop-daemon.sh --script /opt/hadoop/bin/hdfs start zkfc

* Start the HDFS Journalnode on the Namenode servers

  .. code-block:: bash

     /opt/hadoop/sbin/hadoop-daemon.sh start journalnode

* Initialize and format the Namenode

  .. code-block:: bash

     /opt/hadoop/bin/hdfs zkfc -formatZK
     /opt/hadoop/bin/hdfs namenode -format -force
     /opt/hadoop/sbin/hadoop-daemon.sh start namenode

* Check the status

  .. code-block:: bash

     bin/hdfs haadmin -getServiceState nn1
     bin/hdfs haadmin -getServiceState nn2
     bin/yarn rmadmin -getServiceState rm1
     bin/yarn rmadmin -getServiceState rm2

Convert single namenode to HA
=============================

.. code-block:: bash

   /opt/hadoop/bin/hdfs namenode -bootstrapStandby
   /opt/hadoop/bin/hdfs namenode -initializeSharedEdits
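Whether HA was set up from scratch or converted, it is worth verifying that failover actually works. A minimal test, assuming the ``nn1``/``nn2`` IDs and paths from the configuration above, that ``nn1`` is currently active, and that the zkfc daemons are running on both Namenodes:

.. code-block:: bash

   # Stop the active NameNode on hadoop_master ...
   su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh stop namenode'

   # ... after a few seconds the standby should report itself as active
   /opt/hadoop/bin/hdfs haadmin -getServiceState nn2

   # Restart the stopped NameNode; it rejoins the cluster as standby
   su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh start namenode'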
Configure Capacity Scheduler
============================

* The CapacityScheduler is designed to allow sharing a large cluster while giving each organization a minimum capacity guarantee.
* Make sure ``org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler`` is set as ``yarn.resourcemanager.scheduler.class`` in ``etc/hadoop/yarn-site.xml``
* Configure resources for the unix groups a, b and default
* Edit ``etc/hadoop/capacity-scheduler.xml``

  .. code-block:: xml

     <property>
       <name>yarn.scheduler.capacity.root.queues</name>
       <value>a,b,default</value>
       <description>The queues at this level (root is the root queue).</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.a.capacity</name>
       <value>30</value>
       <description>Queue target capacity.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.a.user-limit-factor</name>
       <value>1</value>
       <description>Queue user limit, a percentage from 0.0 to 1.0.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
       <value>100</value>
       <description>The maximum capacity of the queue.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.a.state</name>
       <value>RUNNING</value>
       <description>The state of the queue. State can be one of RUNNING or STOPPED.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.a.acl_submit_applications</name>
       <value>group_a</value>
       <description>The ACL of who can submit jobs to the queue.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.b.capacity</name>
       <value>30</value>
       <description>Queue target capacity.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.b.user-limit-factor</name>
       <value>1</value>
       <description>Queue user limit, a percentage from 0.0 to 1.0.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.b.maximum-capacity</name>
       <value>100</value>
       <description>The maximum capacity of the queue.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.b.state</name>
       <value>RUNNING</value>
       <description>The state of the queue. State can be one of RUNNING or STOPPED.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.b.acl_submit_applications</name>
       <value>group_b</value>
       <description>The ACL of who can submit jobs to the queue.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.default.capacity</name>
       <value>40</value>
       <description>Default queue target capacity.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
       <value>1</value>
       <description>Default queue user limit, a percentage from 0.0 to 1.0.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
       <value>100</value>
       <description>The maximum capacity of the default queue.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.default.state</name>
       <value>RUNNING</value>
       <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
     </property>
     <property>
       <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
       <value>*</value>
       <description>The ACL of who can submit jobs to the default queue.</description>
     </property>

* Refresh the queues

  .. code-block:: bash

     bin/yarn rmadmin -refreshQueues
     bin/hadoop queue -list

* Submit a test job

  .. code-block:: bash

     bin/hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi -Dmapred.job.queue.name=a 2 10

Configure Fair Scheduler
========================

* All jobs get, on average, an equal share of resources over time
* Make sure ``org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler`` is set as ``yarn.resourcemanager.scheduler.class`` in ``etc/hadoop/yarn-site.xml``
* A pool has the same name as the user
* Create the allocation file ``etc/hadoop/fair-scheduler.xml`` with the per-queue limits and weights (see the sketch after this list)
* Restart the resource manager
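The allocation entries depend on the queues you want to define; the following is only a rough sketch (the queue name and the numeric limits are illustrative, not values from a real cluster). If the file is not picked up automatically, point ``yarn.scheduler.fair.allocation.file`` in ``yarn-site.xml`` at it.

.. code-block:: xml

   <?xml version="1.0"?>
   <allocations>
     <!-- Per-queue limits; with user-as-pool setups queues are created on demand -->
     <queue name="sample_queue">
       <maxRunningApps>5</maxRunningApps>
       <weight>2.0</weight>
       <schedulingPolicy>fair</schedulingPolicy>
     </queue>
     <!-- Limit applied to every user unless overridden by a <user> element -->
     <userMaxAppsDefault>3</userMaxAppsDefault>
   </allocations>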
HDFS NFS Gateway
================

* Edit ``hdfs-site.xml``

  .. code-block:: xml

     <property>
       <name>dfs.nfs3.dump.dir</name>
       <value>/data/hadoop/.hdfs-nfs</value>
     </property>
     <property>
       <name>dfs.nfs.exports.allowed.hosts</name>
       <value>* rw</value>
     </property>

* Edit ``core-site.xml``

  .. code-block:: xml

     <property>
       <name>hadoop.proxyuser.nfsserver.groups</name>
       <value>*</value>
       <description>The 'nfsserver' user is allowed to proxy all members of the 'nfs-users1' and 'nfs-users2' groups. Set this to '*' to allow the nfsserver user to proxy any group.</description>
     </property>
     <property>
       <name>hadoop.proxyuser.nfsserver.hosts</name>
       <value>*</value>
       <description>This is the host where the nfs gateway is running. Set this to '*' to allow requests from any hosts to be proxied.</description>
     </property>

* Start the daemons

  .. code-block:: bash

     systemctl stop rpcbind
     sbin/hadoop-daemon.sh start portmap
     su - nfsserver -c "sbin/hadoop-daemon.sh start nfs3"

* Test the nfs mount

  .. code-block:: bash

     rpcinfo -p localhost
     showmount -e
     mount -t nfs -o vers=3,proto=tcp,nolock localhost:/ /mnt

Troubleshooting
===============

* ``java.lang.IllegalArgumentException: Illegal capacity of -1.0 for queue`` -> no capacity has been defined for the queue, i.e. ``yarn.scheduler.capacity.root.$QUEUENAME.capacity`` is missing
* ``org.apache.hadoop.util.Shell$ExitCodeException: chmod: cannot access `/user/myuser1544460269/.staging/job_local1544460269_0001': No such file or directory`` -> set ``mapreduce.framework.name`` to ``yarn`` in ``mapred-site.xml``
* A datanode says ``Initialization failed for Block pool`` -> the Namenode was formatted or changed afterwards; delete the contents of ``dfs.data.dir`` on that datanode (see the sketch below)
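A sketch of that last recovery on the single-node setup from above, assuming ``dfs.data.dir`` is ``/data/hadoop/data-node``; note that this discards all block replicas stored on the node:

.. code-block:: bash

   su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh stop datanode'
   rm -rf /data/hadoop/data-node/*
   su - hadoop -c '/opt/hadoop/sbin/hadoop-daemon.sh start datanode'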