
HDFS Configuration Settings Guide

This document contains configuration settings for Hadoop Distributed File System (HDFS). It defines settings such as the HDFS version, encryption settings, ports for various HDFS daemons, directories for storing data and metadata, permissions, and limits on file and block sizes. The configuration provides default values for settings like the replication factor, block size, and intervals for tasks like block reporting.

Uploaded by

Vinod Bihal

name

hadoop.hdfs.configuration.version

value
1

dfs.https.enable

false

dfs.http.policy

HTTP_ONLY

dfs.client.https.needauth

false

dfs.client.cached.conn.retry

dfs.https.server.keystore.resource

ssl-server.xml

description
Version of this configuration file.
The logging level for dfs namenode. Other values are "dir" (trace namespace mutations), "block" (trace block under/over replications and block creations/deletions), or "all".
RPC address that handles all client requests. In the case of HA/Federation where multiple namenodes exist, the name service id is added to the name, e.g. dfs.namenode.rpc-address.ns1 dfs.namenode.rpc-address.EXAMPLENAMESERVICE. The value of this property will take the form of nn-host1:rpc-port.
The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node listen on all interfaces by setting it to 0.0.0.0.
RPC address for HDFS Services communication. BackupNode, Datanodes and all other services should be connecting to this address if it is configured. In the case of HA/Federation where multiple namenodes exist, the name service id is added to the name, e.g. dfs.namenode.servicerpc-address.ns1 dfs.namenode.rpc-address.EXAMPLENAMESERVICE. The value of this property will take the form of nn-host1:rpc-port. If the value of this property is unset the value of dfs.namenode.rpc-address will be used as the default.
The actual address the service RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.servicerpc-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node listen on all interfaces by setting it to 0.0.0.0.
The secondary namenode http server address and port.
The secondary namenode HTTPS server address and port.
The datanode server address and port for data transfer.
The datanode http server address and port.
The datanode ipc server address and port.
The number of server threads for the datanode.
The address and the base port where the dfs namenode web ui will listen on.
The actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.http-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node HTTP server listen on all interfaces by setting it to 0.0.0.0.
Deprecated. Use "dfs.http.policy" instead.
Decide if HTTPS (SSL) is supported on HDFS. This configures the HTTP endpoint for HDFS daemons. The following values are supported: HTTP_ONLY: Service is provided only on http. HTTPS_ONLY: Service is provided only on https. HTTP_AND_HTTPS: Service is provided both on http and https.
Whether SSL client certificate authentication is required.
The number of times the HDFS client will pull a socket from the cache. Once this number is exceeded, the client will try to create a new socket.
Resource file from which ssl server keystore information will be extracted.
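The bind-host descriptions above say that the optional bind host replaces only the hostname portion of the configured address and leaves the port alone. A minimal sketch of that rule follows. The helper name effective_bind_address and the port 8020 in the usage note are illustrative assumptions, not part of HDFS, and the helper assumes the address is already in host:port form.

```python
from typing import Optional


def effective_bind_address(address: str, bind_host: Optional[str] = None) -> str:
    """Return the host:port a server would bind to.

    address   -- configured "host:port" value (e.g. the rpc-address setting)
    bind_host -- optional override for the host part only (e.g. "0.0.0.0")
    """
    # Split on the last colon so only the trailing port is separated.
    host, _, port = address.rpartition(":")
    if bind_host:
        # The override replaces the hostname portion; the port is kept.
        host = bind_host
    return f"{host}:{port}"
```

For example, effective_bind_address("nn-host1:8020", "0.0.0.0") keeps port 8020 but listens on all interfaces, while omitting the second argument returns the address unchanged.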

dfs.namenode.logging.level

info

dfs.client.https.keystore.resource
dfs.datanode.https.address

ssl-client.xml
0.0.0.0:50475

Resource file from which ssl client keystore information will be extracted.
The datanode secure http server address and port.

dfs.namenode.rpc-address

dfs.namenode.rpc-bind-host

dfs.namenode.servicerpc-address

dfs.namenode.servicerpc-bind-host
dfs.namenode.secondary.http-address
dfs.namenode.secondary.https-address
dfs.datanode.address
dfs.datanode.http.address
dfs.datanode.ipc.address
dfs.datanode.handler.count
dfs.namenode.http-address

0.0.0.0:50090
0.0.0.0:50091
0.0.0.0:50010
0.0.0.0:50075
0.0.0.0:50020
10
0.0.0.0:50070

dfs.namenode.http-bind-host

dfs.namenode.https-address

0.0.0.0:50470

dfs.namenode.https-bind-host
dfs.datanode.dns.interface

default

dfs.datanode.dns.nameserver

default

dfs.namenode.backup.address

0.0.0.0:50100

dfs.namenode.backup.http-address

0.0.0.0:50105

dfs.namenode.replication.considerLoad

true

dfs.default.chunk.view.size
dfs.datanode.du.reserved

32768
0

dfs.namenode.name.dir

file://${hadoop.tmp.dir}/dfs/name

dfs.namenode.name.dir.restore

false

dfs.namenode.fs-limits.max-component-length

255

dfs.namenode.fs-limits.max-directory-items

1048576

dfs.namenode.fs-limits.min-block-size

1048576

dfs.namenode.fs-limits.max-blocks-per-file

1048576

dfs.namenode.edits.dir

${dfs.namenode.name.dir}

dfs.namenode.shared.edits.dir
dfs.namenode.edits.journal-plugin.qjournal

org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager

dfs.permissions.enabled

true

dfs.permissions.superusergroup

supergroup

dfs.namenode.acls.enabled

false

The namenode secure http server address and port.
The actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.https-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node HTTPS server listen on all interfaces by setting it to 0.0.0.0.
The name of the Network Interface from which a data node should report its IP address.
The host name or IP address of the name server (DNS) which a DataNode should use to determine the host name used by the NameNode for communication and display purposes.
The backup node server address and port. If the port is 0 then the server will start on a free port.
The backup node http server address and port. If the port is 0 then the server will start on a free port.
Decide if chooseTarget considers the target's load or not.
The number of bytes to view for a file on the browser.
Reserved space in bytes per volume. Always leave this much space free for non dfs use.
Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
Set to true to enable NameNode to attempt recovering a previously failed dfs.namenode.name.dir. When enabled, a recovery of any failed directory is attempted during checkpoint.
Defines the maximum number of bytes in UTF-8 encoding in each component of a path. A value of 0 will disable the check.
Defines the maximum number of items that a directory may contain. A value of 0 will disable the check.
Minimum block size in bytes, enforced by the Namenode at create time. This prevents the accidental creation of files with tiny block sizes (and thus many blocks), which can degrade performance.
Maximum number of blocks per file, enforced by the Namenode on write. This prevents the creation of extremely large files which can degrade performance.
Determines where on the local filesystem the DFS name node should store the transaction (edits) file. If this is a comma-delimited list of directories then the transaction file is replicated in all of the directories, for redundancy. Default value is same as dfs.namenode.name.dir.
A directory on shared storage between the multiple namenodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir above. It should be left empty in a non-HA cluster.
If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.
The name of the group of super-users.
Set to true to enable support for HDFS ACLs (Access Control Lists). By default, ACLs are disabled. When ACLs are disabled, the NameNode rejects all RPCs related to setting or getting ACLs.

dfs.namenode.lazypersist.file.scrub.interval.sec

300

dfs.block.access.token.enable

false

dfs.block.access.key.update.interval
dfs.block.access.token.lifetime

600
600

dfs.datanode.data.dir

file://${hadoop.tmp.dir}/dfs/data

dfs.datanode.data.dir.perm

700

dfs.replication

dfs.replication.max
dfs.namenode.replication.min

512
1

dfs.blocksize

134217728

dfs.client.block.write.retries

dfs.client.block.write.replace-datanode-on-failure.enable

true

dfs.client.block.write.replace-datanode-on-failure.policy

DEFAULT

dfs.client.block.write.replace-datanode-on-failure.best-effort false

dfs.blockreport.intervalMsec
dfs.blockreport.initialDelay

21600000
0

The NameNode periodically scans the namespace for LazyPersist files with missing blocks and unlinks them from the namespace. This configuration key controls the interval between successive scans. Set it to a negative value to disable this behavior.
If "true", access tokens are used as capabilities for accessing datanodes. If "false", no access tokens are checked on accessing datanodes.
Interval in minutes at which namenode updates its access keys.
The lifetime of access tokens in minutes.
Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
Permissions for the directories on the local filesystem where the DFS data node stores its blocks. The permissions can either be octal or symbolic.
Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.
Maximal block replication.
Minimal block replication.
The default block size for new files, in bytes. You can use the following suffix (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa) to specify the size (such as 128k, 512m, 1g, etc.), or provide the complete size in bytes (such as 134217728 for 128 MB).
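The suffix notation for the block size can be illustrated with a small parser. This is a sketch, not Hadoop's own parsing code: the function name parse_size is hypothetical, and it assumes binary multiples (1k = 1024 bytes), which matches the source's statement that 134217728 bytes is 128 MB.

```python
# Power of 1024 associated with each case-insensitive suffix.
_SUFFIX_POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5, "e": 6}


def parse_size(text: str) -> int:
    """Convert a value such as "128m" or "134217728" to a byte count."""
    value = text.strip().lower()
    if value and value[-1] in _SUFFIX_POWERS:
        # Assumption: suffixes are binary multiples (k = 1024, m = 1024**2, ...).
        return int(value[:-1]) * 1024 ** _SUFFIX_POWERS[value[-1]]
    # No suffix: the value is already a complete size in bytes.
    return int(value)
```

With this reading, parse_size("128m") and parse_size("134217728") give the same number of bytes.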
The number of retries for writing blocks to the data nodes, before we signal failure to the application.
If there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. The feature is to add new datanodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new datanodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy.
This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. ALWAYS: always add a new datanode when an existing datanode is removed. NEVER: never add a new datanode. DEFAULT: Let r be the replication number. Let n be the number of existing datanodes. Add a new datanode only if r is greater than or equal to 3 and either (1) floor(r/2) is greater than or equal to n or (2) r is greater than n and the block is hflushed/appended.
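The three policy values described above can be written out as a small decision function. This is a sketch of the rule as stated in the description, not the client's actual implementation; the function and parameter names are hypothetical.

```python
def should_add_datanode(policy: str, r: int, n: int,
                        hflushed_or_appended: bool = False) -> bool:
    """Decide whether to add a replacement datanode to the write pipeline.

    r -- replication number of the block being written
    n -- number of datanodes still in the pipeline
    """
    if policy == "ALWAYS":
        return True
    if policy == "NEVER":
        return False
    if policy == "DEFAULT":
        if r < 3:
            return False
        # (1) floor(r/2) >= n, or (2) r > n and the block is hflushed/appended.
        return (r // 2 >= n) or (r > n and hflushed_or_appended)
    raise ValueError(f"unknown policy: {policy!r}")
```

With r = 3 and two datanodes left, DEFAULT adds a replacement only if the block has been hflushed or appended. With a single datanode left it always does, because floor(3/2) = 1 is greater than or equal to n.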
This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Best effort means that the client will try to replace a failed datanode in write pipeline (provided that the policy is satisfied), however, it continues the write operation in case that the datanode replacement also fails. Suppose the datanode replacement fails. false: An exception should be thrown so that the write will fail. true: The write should be resumed with the remaining datanodes. Note that setting this property to true allows writing to a pipeline with a smaller number of datanodes. As a result, it increases the probability of data loss.
Determines block reporting interval in milliseconds.
Delay for first block report in seconds.
If the number of blocks on the DataNode is below this threshold then it will send block

dfs.blockreport.split.threshold

1000000

dfs.datanode.directoryscan.interval

21600

dfs.datanode.directoryscan.threads

dfs.heartbeat.interval
dfs.namenode.handler.count

3
10

dfs.namenode.safemode.threshold-pct

0.999f

dfs.namenode.safemode.min.datanodes

dfs.namenode.safemode.extension

30000

dfs.namenode.resource.check.interval

5000

dfs.namenode.resource.du.reserved

104857600

dfs.namenode.resource.checked.volumes
dfs.namenode.resource.checked.volumes.minimum

dfs.datanode.balance.bandwidthPerSec

1048576

dfs.hosts

dfs.hosts.exclude
dfs.namenode.max.objects

dfs.namenode.datanode.registration.ip-hostname-check

true

dfs.namenode.decommission.interval

30

dfs.namenode.decommission.nodes.per.interval

reports for all Storage Directories in a single message. If the number of blocks exceeds this threshold then the DataNode will send block reports for each Storage Directory in separate messages. Set to zero to always split.
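The block report split threshold (dfs.blockreport.split.threshold, listed with the value 1000000) amounts to a single comparison. The sketch below uses a hypothetical function name. The description does not say what happens when the block count equals the threshold, so treating that case as "split" is an assumption, chosen because it also makes a threshold of zero always split.

```python
def split_block_report(num_blocks: int, split_threshold: int = 1_000_000) -> bool:
    """Return True if the DataNode should send one report per storage directory.

    Below the threshold a single combined message is sent. At or above it
    (assumption for the equal case) the report is split per directory.
    A threshold of 0 therefore always splits.
    """
    return num_blocks >= split_threshold
```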
Interval in seconds for Datanode to scan data directories and reconcile the difference between blocks in memory and on the disk.
How many threads should the threadpool used to compile reports for volumes in parallel have.
Determines datanode heartbeat interval in seconds.
The number of server threads for the namenode.
Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.namenode.replication.min. Values less than or equal to 0 mean not to wait for any particular percentage of blocks before exiting safemode. Values greater than 1 will make safe mode permanent.
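The safemode threshold rule can be sketched as a predicate over block counts. This follows only the description above and is not the NameNode's actual check; the function name is hypothetical, and the default of 0.999 is taken from the value listed for dfs.namenode.safemode.threshold-pct.

```python
def block_threshold_reached(safe_blocks: int, total_blocks: int,
                            threshold_pct: float = 0.999) -> bool:
    """Return True if enough blocks meet minimal replication to leave safemode.

    safe_blocks  -- blocks that satisfy the minimal replication requirement
    total_blocks -- all blocks known to the namenode
    """
    if threshold_pct <= 0:
        # Do not wait for any particular percentage of blocks.
        return True
    if threshold_pct > 1:
        # More than 100% can never be satisfied: safemode is permanent.
        return False
    return safe_blocks >= threshold_pct * total_blocks
```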
Specifies the number of datanodes that must be considered alive before the name node exits safemode. Values less than or equal to 0 mean not to take the number of live datanodes into account when deciding whether to remain in safe mode during startup. Values greater than the number of datanodes in the cluster will make safe mode permanent.
Determines extension of safe mode in milliseconds after the threshold level is reached.
The interval in milliseconds at which the NameNode resource checker runs. The checker calculates the number of the NameNode storage volumes whose available spaces are more than dfs.namenode.resource.du.reserved, and enters safemode if the number becomes lower than the minimum value specified by dfs.namenode.resource.checked.volumes.minimum.
The amount of space to reserve/require for a NameNode storage directory in bytes. The default is 100MB.
A list of local directories for the NameNode resource checker to check in addition to the local edits directories.
The minimum number of redundant NameNode storage volumes required.
Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose in terms of the number of bytes per second.
Names a file that contains a list of hosts that are permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted.
Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.
The maximum number of files, directories and blocks dfs supports. A value of zero indicates no limit to the number of objects that dfs supports.
If true (the default), then the namenode requires that a connecting datanode's address must be resolved to a hostname. If necessary, a reverse DNS lookup is performed. All attempts to register a datanode from an unresolvable address are rejected. It is recommended that this setting be left on to prevent accidental registration of datanodes listed by hostname in the excludes file during a DNS outage. Only set this to false in environments where there is no infrastructure to support reverse DNS lookup.
Namenode periodicity in seconds to check if decommission is complete.
The number of nodes namenode checks if decommission is complete in each dfs.namenode.decommission.interval.

dfs.namenode.replication.interval

The periodicity in seconds with which the namenode computes replication work for datanodes.

dfs.namenode.accesstime.precision

3600000

The access time for HDFS file is precise up to this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS.

dfs.datanode.plugins

Comma-separated list of datanode plug-ins to be activated.

dfs.namenode.plugins

Comma-separated list of namenode plug-ins to be activated.
The size of buffer to stream files. The size of this buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.

dfs.stream-buffer-size

4096

dfs.bytes-per-checksum
dfs.client-write-packet-size

512
65536

The number of bytes per checksum. Must not be larger than dfs.stream-buffer-size.
Packet size for clients to write.
The maximum period to keep a DN in the excluded nodes list at a client. After this period, in milliseconds, the previously excluded node(s) will be removed automatically from the cache and will be considered good for block allocations again. Useful to lower or raise in situations where you keep a file open for very long periods (such as a Write-Ahead-Log (WAL) file) to make the writer tolerant to cluster maintenance restarts. Defaults to 10 minutes.
Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.

dfs.client.write.exclude.nodes.cache.expiry.interval.millis

600000

dfs.namenode.checkpoint.dir

file://${hadoop.tmp.dir}/dfs/namesecondary

dfs.namenode.checkpoint.edits.dir

${dfs.namenode.checkpoint.dir}

dfs.namenode.checkpoint.period

3600

dfs.namenode.checkpoint.txns

1000000

dfs.namenode.checkpoint.check.period

60

The Secondary NameNode and CheckpointNode will poll the NameNode every 'dfs.namenode.checkpoint.check.period' seconds to query the number of uncheckpointed transactions.

dfs.namenode.checkpoint.max-retries

The Secondary NameNode retries failed checkpointing. If the failure occurs while loading fsimage or replaying edits, the number of retries is limited by this variable.

dfs.namenode.num.checkpoints.retained

dfs.namenode.num.extra.edits.retained

1000000

dfs.namenode.max.extra.edits.segments.retained

10000

Determines where on the local filesystem the DFS secondary name node should store the temporary edits to merge. If this is a comma-delimited list of directories then the edits are replicated in all of the directories for redundancy. Default value is same as dfs.namenode.checkpoint.dir.
The number of seconds between two periodic checkpoints.
The Secondary NameNode or CheckpointNode will create a checkpoint of the namespace every 'dfs.namenode.checkpoint.txns' transactions, regardless of whether 'dfs.namenode.checkpoint.period' has expired.
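The interaction between the checkpoint period and the transaction count is an "either condition" rule, sketched below. The function name is hypothetical. The defaults are the values listed in this document for dfs.namenode.checkpoint.period (3600 seconds) and dfs.namenode.checkpoint.txns (1000000), and treating "exactly equal" as due is an assumption.

```python
def checkpoint_due(seconds_since_last: float, uncheckpointed_txns: int,
                   period_s: int = 3600, txns: int = 1_000_000) -> bool:
    """Return True when a new checkpoint should be created.

    A checkpoint is due once the period has elapsed, or once enough
    transactions have accumulated, whichever happens first.
    """
    return seconds_since_last >= period_s or uncheckpointed_txns >= txns
```

A busy namespace can therefore checkpoint well before the hour is up: checkpoint_due(120, 2_000_000) is true even though only two minutes have passed.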

The number of image checkpoint files that will be retained by the NameNode and Secondary NameNode in their storage directories. All edit logs necessary to recover an up-to-date namespace from the oldest retained checkpoint will also be retained.
The number of extra transactions which should be retained beyond what is minimally necessary for a NN restart. This can be useful for audit purposes or for an HA setup where a remote Standby Node may have been offline for some time and need to have a longer backlog of retained edits in order to start again. Typically each edit is on the order of a few hundred bytes, so the default of 1 million edits should be on the order of hundreds of MBs or low GBs. NOTE: Fewer extra edits may be retained than value specified for this setting if doing so would mean that more segments would be retained than the number configured by dfs.namenode.max.extra.edits.segments.retained.
The maximum number of extra edit log segments which should be retained beyond what is minimally necessary for a NN restart. When used in conjunction with dfs.namenode.num.extra.edits.retained, this configuration property serves to cap the number of extra edits files to a reasonable value.

dfs.namenode.delegation.key.update-interval
dfs.namenode.delegation.token.max-lifetime

86400000
604800000

The update interval for master key for delegation tokens in the namenode in milliseconds.
The maximum lifetime in milliseconds for which a delegation token is valid.

dfs.namenode.delegation.token.renew-interval

86400000

The renewal interval for delegation token in milliseconds.

dfs.datanode.failed.volumes.tolerated

The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown.

dfs.image.compress

false

dfs.image.compression.codec

org.apache.hadoop.io.compress.DefaultCodec

dfs.image.transfer.timeout

60000

Should the dfs image be compressed?
If the dfs image is compressed, how should it be compressed? This has to be a codec defined in io.compression.codecs.
Socket timeout for image transfer in milliseconds. This timeout and the related dfs.image.transfer.bandwidthPerSec parameter should be configured such that normal image transfer can complete successfully. This timeout prevents client hangs when the sender fails during image transfer. This is socket timeout during image transfer.

dfs.image.transfer.bandwidthPerSec

Maximum bandwidth used for image transfer in bytes per second. This can help keep normal namenode operations responsive during checkpointing. The maximum bandwidth and timeout in dfs.image.transfer.timeout should be set such that normal image transfers can complete successfully. A default value of 0 indicates that throttling is disabled.

dfs.image.transfer.chunksize

65536

Chunk size in bytes to upload the checkpoint. Chunked streaming is used to avoid internal buffering of contents of image file of huge size.

dfs.namenode.support.allow.format

true

Does HDFS namenode allow itself to be formatted? You may consider setting this to false for any production cluster, to avoid any possibility of formatting a running DFS.

dfs.datanode.max.transfer.threads

4096

Specifies the maximum number of threads to use for transferring data in and out of the DN.

dfs.datanode.scan.period.hours

dfs.block.scanner.volume.bytes.per.second

1048576

dfs.datanode.readahead.bytes

4193404

dfs.datanode.drop.cache.behind.reads

false

dfs.datanode.drop.cache.behind.writes

false

If this is 0 or negative, the DataNode's block scanner will be disabled. If this is positive, the DataNode will not scan any individual block more than once in the specified scan period.
If this is 0, the DataNode's block scanner will be disabled. If this is positive, this is the number of bytes per second that the DataNode's block scanner will try to scan from each volume.
While reading block files, if the Hadoop native libraries are available, the datanode can use the posix_fadvise system call to explicitly page data into the operating system buffer cache ahead of the current reader's position. This can improve performance especially when disks are highly contended. This configuration specifies the number of bytes ahead of the current read position which the datanode will attempt to read ahead. This feature may be disabled by configuring this property to 0. If the native libraries are not available, this configuration has no effect.
In some workloads, the data read from HDFS is known to be significantly large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is delivered to the client. This behavior is automatically disabled for workloads which read only short sections of a block (e.g. HBase random-IO workloads). This may improve performance for some workloads by freeing buffer cache space usage for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect.
In some workloads, the data written to HDFS is known to be significantly large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is written to disk. This may improve performance for some workloads by freeing

dfs.datanode.sync.behind.writes

false

dfs.client.failover.max.attempts

15

dfs.client.failover.sleep.base.millis

500

buffer cache space usage for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect.
If this configuration is enabled, the datanode will instruct the operating system to enqueue all written data to the disk immediately after it is written. This differs from the usual OS policy which may wait for up to 30 seconds before triggering writeback. This may improve performance for some workloads by smoothing the IO profile for data written to disk. If the Hadoop native libraries are not available, this configuration has no effect.
Expert only. The number of client failover attempts that should be made before the failover is considered failed.
Expert only. The time to wait, in milliseconds, between failover attempts increases exponentially as a function of the number of attempts made so far, with a random factor of +/- 50%. This option specifies the base value used in the failover calculation. The first failover will retry immediately. The 2nd failover attempt will delay at least dfs.client.failover.sleep.base.millis milliseconds. And so on.

dfs.client.failover.sleep.max.millis

15000

Expert only. The time to wait, in milliseconds, between failover attempts increases exponentially as a function of the number of attempts made so far, with a random factor of +/- 50%. This option specifies the maximum value to wait between failovers. Specifically, the time between two failover attempts will not exceed +/- 50% of dfs.client.failover.sleep.max.millis milliseconds.
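The base and maximum sleep settings describe an exponential backoff with random jitter. The sketch below is one way to satisfy the stated properties (the first failover retries immediately, the second waits at least the base value, and no wait exceeds the maximum plus 50%). Doubling per attempt and applying the jitter after the cap are assumptions, not a transcription of the client's code, and the function name is hypothetical. The defaults 500 and 15000 are the values listed in this document.

```python
import random
from typing import Callable


def failover_delay_ms(failovers_so_far: int, base_ms: int = 500,
                      max_ms: int = 15000,
                      rng: Callable[[], float] = random.random) -> float:
    """Return how long to wait before the next failover attempt.

    failovers_so_far -- 0 for the first failover, 1 for the second, ...
    rng              -- source of a random number in [0, 1), injectable for tests
    """
    if failovers_so_far == 0:
        # The first failover retries immediately.
        return 0.0
    # Assumption: the un-jittered delay doubles per attempt, capped at max_ms.
    capped = min(base_ms * 2 ** failovers_so_far, max_ms)
    # Random factor of +/- 50%: multiply by a value in [0.5, 1.5).
    return capped * (0.5 + rng())
```

Under these assumptions the second failover waits between 500 and 1500 ms, and late attempts never wait more than 1.5 times dfs.client.failover.sleep.max.millis.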

dfs.client.failover.connection.retries

Expert only. Indicates the number of retries a failover IPC client will make to establish a server connection.

dfs.client.failover.connection.retries.on.timeouts

Expert only. The number of retry attempts a failover IPC client will make on socket timeout when establishing a server connection.

30

Expert only. The time to wait, in seconds, from reception of a datanode shutdown notification for quick restart, until declaring the datanode dead and invoking the normal recovery mechanisms. The notification is sent by a datanode when it is being shutdown using the shutdownDatanode admin command with the upgrade option.

dfs.client.datanode-restart.timeout
dfs.nameservices

Comma-separated list of nameservices.
The ID of this nameservice. If the nameservice ID is not configured or more than one nameservice is configured for dfs.nameservices it is determined automatically by matching the local node's address with the configured address.

dfs.nameservice.id
dfs.internal.nameservices

Comma-separated list of nameservices that belong to this cluster. Datanode will report to all the nameservices in this list. By default this is set to the value of dfs.nameservices.

dfs.ha.namenodes.EXAMPLENAMESERVICE

The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).

dfs.ha.namenode.id

The ID of this namenode. If the namenode ID is not configured it is determined automatically by matching the local node's address with the configured address.

dfs.ha.log-roll.period

120

How often, in seconds, the StandbyNode should ask the active to roll edit logs. Since the StandbyNode only reads from finalized log segments, the StandbyNode will only be as up-to-date as how often the logs are rolled. Note that failover triggers a log roll so the StandbyNode will be up to date before it becomes active.

dfs.ha.tail-edits.period

60

How often, in seconds, the StandbyNode should check for new finalized log segments in the shared edits log.

dfs.ha.automatic-failover.enabled

false

Whether automatic failover is enabled. See the HDFS High Availability documentation for details on automatic HA configuration.

dfs.support.append
dfs.client.use.datanode.hostname

true
false

Does HDFS allow appends to files?
Whether clients should use datanode hostnames when connecting to datanodes.

dfs.datanode.use.datanode.hostname

false

dfs.client.local.interfaces

dfs.datanode.shared.file.descriptor.paths

/dev/shm,/tmp

dfs.short.circuit.shared.memory.watcher.interrupt.check.ms 60000
dfs.namenode.kerberos.internal.spnego.principal

${dfs.web.authentication.kerberos.principal}

dfs.secondary.namenode.kerberos.internal.spnego.principal

${dfs.web.authentication.kerberos.principal}

dfs.namenode.avoid.read.stale.datanode

false

dfs.namenode.avoid.write.stale.datanode

false

dfs.namenode.stale.datanode.interval

30000

dfs.namenode.write.stale.datanode.ratio

0.5f

dfs.namenode.invalidate.work.pct.per.iteration

0.32f

dfs.namenode.replication.work.multiplier.per.iteration

Whether datanodes should use datanode hostnames when connecting to other datanodes for data transfer.
A comma-separated list of network interface names to use for data transfer between the client and datanodes. When creating a connection to read from or write to a datanode, the client chooses one of the specified interfaces at random and binds its socket to the IP of that interface. Individual names may be specified as either an interface name (eg "eth0"), a subinterface name (eg "eth0:0"), or an IP address (which may be specified using CIDR notation to match a range of IPs).
A comma-separated list of paths to use when creating file descriptors that will be shared between the DataNode and the DFSClient. Typically we use /dev/shm, so that the file descriptors will not be written to disk. Systems that don't have /dev/shm will fall back to /tmp by default.
The length of time in milliseconds that the short-circuit shared memory watcher will go between checking for java interruptions sent from other threads. This is provided mainly for unit tests.

Indicate whether or not to avoid reading from "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Stale datanodes will be moved to the end of the node list returned for reading. See dfs.namenode.avoid.write.stale.datanode for a similar setting for writes.
Indicate whether or not to avoid writing to "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Writes will avoid using stale datanodes unless more than a configured ratio (dfs.namenode.write.stale.datanode.ratio) of datanodes are marked as stale. See dfs.namenode.avoid.read.stale.datanode for a similar setting for reads.
Default time interval for marking a datanode as "stale", i.e., if the namenode has not received heartbeat msg from a datanode for more than this time interval, the datanode will be marked and treated as "stale" by default. The stale interval cannot be too small since otherwise this may cause too frequent change of stale states. We thus set a minimum stale interval value (the default value is 3 times of heartbeat interval) and guarantee that the stale interval cannot be less than the minimum value. A stale datanode is avoided during lease/block recovery. It can be conditionally avoided for reads (see dfs.namenode.avoid.read.stale.datanode) and for writes (see dfs.namenode.avoid.write.stale.datanode).
When the ratio of stale datanodes to total datanodes is greater than this ratio, stop avoiding writing to stale nodes so as to prevent causing hotspots.
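The stale-datanode settings combine a time test with a ratio test. The sketch below restates them as two small functions under hypothetical names. The defaults 30000 ms and 0.5 are the values listed in this document for dfs.namenode.stale.datanode.interval and dfs.namenode.write.stale.datanode.ratio. How an empty cluster is handled is an assumption.

```python
def is_stale(ms_since_last_heartbeat: int, stale_interval_ms: int = 30000) -> bool:
    """A datanode is stale if no heartbeat arrived for more than the interval."""
    return ms_since_last_heartbeat > stale_interval_ms


def avoid_stale_for_writes(avoid_write_stale: bool, num_stale: int,
                           num_total: int, ratio: float = 0.5) -> bool:
    """Return True if writes should currently steer away from stale datanodes.

    avoid_write_stale -- value of the avoid.write.stale.datanode setting
    """
    if not avoid_write_stale or num_total == 0:
        # Assumption: with no datanodes at all there is nothing to avoid.
        return False
    # Stop avoiding once the stale share exceeds the ratio, to prevent hotspots.
    return num_stale / num_total <= ratio
```

With 10 datanodes and the ratio at 0.5, writes avoid stale nodes while up to 5 are stale and stop avoiding them when 6 or more are.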
*Note*: Advanced property. Change with caution. This determines the percentage amount of block invalidations (deletes) to do over a single DN heartbeat deletion command. The final deletion count is determined by applying this percentage to the number of live nodes in the system. The resultant number is the number of blocks from the deletion list chosen for proper invalidation over a single heartbeat of a single DN. Value should be a positive, non-zero percentage in float notation (X.Yf), with 1.0f meaning 100%.
*Note*: Advanced property. Change with caution. This determines the total amount of block transfers to begin in parallel at a DN, for replication, when such a command list is being sent over a DN heartbeat by the NN. The actual number is obtained by multiplying this multiplier with the total number of live nodes in the cluster. The result number is the number of blocks to begin transfers immediately for, per DN heartbeat. This number can be any positive, non-zero integer.
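Both advanced properties scale with the number of live nodes, which the following sketch makes concrete. The function names are hypothetical, and rounding the invalidation count up to a whole block is an assumption, since the description does not say how fractions are handled. The 0.32 default is the value listed for dfs.namenode.invalidate.work.pct.per.iteration. The multiplier has no value in this listing, so it is a required argument.

```python
import math


def blocks_to_invalidate(live_nodes: int, pct: float = 0.32) -> int:
    """Blocks chosen for deletion per DN heartbeat: pct applied to live nodes.

    Assumption: a fractional result is rounded up.
    """
    return math.ceil(live_nodes * pct)


def replication_transfers(live_nodes: int, multiplier: int) -> int:
    """Block transfers to begin per DN heartbeat: multiplier times live nodes."""
    return live_nodes * multiplier
```

For a 10-node cluster, a percentage of 0.32 yields 4 block invalidations per heartbeat under the round-up assumption, and a multiplier of 2 (an illustrative value) yields 20 transfers.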

nfs.server.port

2049

Specify the port number used by Hadoop NFS.

nfs.mountd.port

4242

nfs.dump.dir

/tmp/.hdfsnfs

Specify the port number used by Hadoop mount daemon.
This directory is used to temporarily save out-of-order writes before writing to HDFS. For each file, the out-of-order writes are dumped after they are accumulated to exceed certain threshold (e.g., 1MB) in memory. One needs to make sure the directory has enough space.

nfs.rtmax

1048576

nfs.wtmax

1048576

nfs.keytab.file
nfs.kerberos.principal

nfs.allow.insecure.ports

true

dfs.webhdfs.enabled

true

hadoop.fuse.connection.timeout

300

hadoop.fuse.timer.period

dfs.metrics.percentiles.intervals

dfs.encrypt.data.transfer

false

dfs.encrypt.data.transfer.algorithm

dfs.encrypt.data.transfer.cipher.suites

dfs.encrypt.data.transfer.cipher.key.bitlength

dfs.trustedchannel.resolver.class

128

This is the maximum size in bytes of a READ request supported by the NFS gateway. If you change this, make sure you also update the nfs mount's rsize (add rsize= # of bytes to the mount directive).
This is the maximum size in bytes of a WRITE request supported by the NFS gateway. If you change this, make sure you also update the nfs mount's wsize (add wsize= # of bytes to the mount directive).
*Note*: Advanced property. Change with caution. This is the path to the keytab file for the hdfs-nfs gateway. This is required when the cluster is kerberized.
*Note*: Advanced property. Change with caution. This is the name of the kerberos principal. This is required when the cluster is kerberized. It must be of this format: nfs-gateway-user/nfs-gateway-host@kerberos-realm
When set to false, client connections originating from unprivileged ports (those above 1023) will be rejected. This is to ensure that clients connecting to this NFS Gateway must have had root privilege on the machine where they're connecting from.
Enable WebHDFS (REST API) in Namenodes and Datanodes.
The minimum number of seconds that we'll cache libhdfs connection objects in fuse_dfs. Lower values will result in lower memory consumption; higher values may speed up access by avoiding the overhead of creating new connection objects.
The number of seconds between cache expiry checks in fuse_dfs. Lower values will result in fuse_dfs noticing changes to Kerberos ticket caches more quickly.
Comma-delimited set of integers denoting the desired rollover intervals (in seconds) for percentile latency metrics on the Namenode and Datanode. By default, percentile latency metrics are disabled.
Whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NN and DNs, clients will deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class.
This value may be set to either "3des" or "rc4". If nothing is set, then the configured JCE default on the system is used (usually 3DES). It is widely believed that 3DES is more cryptographically secure, but RC4 is substantially faster. Note that if AES is supported by both the client and server then this encryption algorithm will only be used to initially transfer keys for AES. (See dfs.encrypt.data.transfer.cipher.suites.)
This value may be either undefined or AES/CTR/NoPadding. If defined, then dfs.encrypt.data.transfer uses the specified cipher suite for data encryption. If not defined, then only the algorithm specified in dfs.encrypt.data.transfer.algorithm is used. By default, the property is not defined.
The key bitlength negotiated by dfsclient and datanode for encryption. This value may be set to either 128, 192 or 256.
TrustedChannelResolver is used to determine whether a channel is trusted for plain data transfer. The TrustedChannelResolver is invoked on both client and server side. If the resolver indicates that the channel is trusted, then the data transfer will not be encrypted even if dfs.encrypt.data.transfer is set to true. The default implementation returns false indicating that the channel is not trusted.

A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data. Possible values are authentication, integrity and privacy. authentication means authentication only and no integrity or privacy; integrity implies authentication and integrity are enabled; and privacy implies all of authentication, integrity and privacy are enabled. If dfs.encrypt.data.transfer is set to true, then it supersedes the setting for dfs.data.transfer.protection and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port. In this case, it is assumed that the use of a privileged port establishes sufficient trust.
SaslPropertiesResolver used to resolve the QOP used for a connection to the DataNode when reading or writing block data. If not specified, the value of hadoop.security.saslproperties.resolver.class is used as the default value.

dfs.data.transfer.protection

dfs.data.transfer.saslproperties.resolver.class
dfs.datanode.hdfs-blocks-metadata.enabled

false

dfs.client.file-block-storage-locations.num-threads

10

dfs.client.file-block-storage-locations.timeout.millis

1000

dfs.journalnode.rpc-address

0.0.0.0:8485

dfs.journalnode.http-address

0.0.0.0:8480

dfs.journalnode.https-address

0.0.0.0:8481

dfs.namenode.audit.loggers

default

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold

10737418240

dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction

0.75f

dfs.namenode.edits.noeditlogchannelflush

false

Boolean which enables backend datanode-side support for the experimental DistributedFileSystem#getFileBlockStorageLocations API.
Number of threads used for making parallel RPCs in DistributedFileSystem#getFileBlockStorageLocations().
Timeout (in milliseconds) for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations().
The JournalNode RPC server address and port.
The address and port the JournalNode HTTP server listens on. If the port is 0 then the server will start on a free port.
The address and port the JournalNode HTTPS server listens on. If the port is 0 then the server will start on a free port.
List of classes implementing audit loggers that will receive audit events. These should be implementations of org.apache.hadoop.hdfs.server.namenode.AuditLogger. The special value "default" can be used to reference the default audit logger, which uses the configured log system. Installing custom audit loggers may affect the performance and stability of the NameNode. Refer to the custom logger's documentation for more details.
Only used when the dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy. This setting controls how much DN volumes are allowed to differ in terms of bytes of free disk space before they are considered imbalanced. If the free space of all the volumes is within this range of each other, the volumes will be considered balanced and block assignments will be done on a pure round-robin basis.
Only used when the dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy. This setting controls what percentage of new block allocations will be sent to volumes with more available disk space than others. This setting should be in the range 0.0 - 1.0, though in practice 0.5 - 1.0, since there should be no reason to prefer that volumes with less available disk space receive more block allocations.
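The balance test described for the two volume-choosing settings can be sketched roughly as below. This is a simplified illustration, not the policy's actual implementation: the function names are hypothetical, and the "expected share" helper only restates the preference fraction as a proportion of allocations.

```python
GIB = 1024 ** 3
BALANCED_SPACE_THRESHOLD = 10 * GIB   # 10737418240 bytes, the listed default
PREFERENCE_FRACTION = 0.75            # the listed default (0.75f)


def volumes_balanced(free_bytes, threshold=BALANCED_SPACE_THRESHOLD):
    """True when every volume's free space is within `threshold` bytes of
    every other volume's, i.e. plain round-robin placement would be used."""
    if not free_bytes:
        raise ValueError("at least one volume is required")
    return max(free_bytes) - min(free_bytes) <= threshold


def expected_preferred_allocations(new_blocks, fraction=PREFERENCE_FRACTION):
    """Approximate count of new blocks steered to the volumes with more
    free space when the volumes are imbalanced."""
    return round(new_blocks * fraction)


# Two volumes 5 GiB apart count as balanced; 20 GiB apart do not.
print(volumes_balanced([100 * GIB, 105 * GIB]))
print(volumes_balanced([100 * GIB, 120 * GIB]))
```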
Specifies whether to flush edit log file channel. When set, expensive FileChannel#force calls are skipped and synchronous disk writes are enabled instead by opening the edit log file with RandomAccessFile("rws") flags. This can significantly improve the performance of edit log writes on the Windows platform. Note that the behavior of the "rws" flags is platform and hardware specific and might not provide the same level of guarantees as FileChannel#force. For example, the write will skip the disk cache on SAS and SCSI devices while it might not on SATA devices. This is an expert level setting, change with caution.
Just like dfs.datanode.drop.cache.behind.writes, this setting causes the page cache to be dropped behind HDFS writes, potentially freeing up more memory for other uses. Unlike dfs.datanode.drop.cache.behind.writes, this is a client-side setting rather than a setting for the entire datanode. If present, this setting will override the DataNode default. If the native libraries are not available to the DataNode, this configuration has no effect.

dfs.client.cache.drop.behind.writes

dfs.client.cache.drop.behind.reads

dfs.client.cache.readahead

dfs.namenode.enable.retrycache

true

dfs.namenode.retrycache.expirytime.millis

600000

dfs.namenode.retrycache.heap.percent

0.03f

dfs.client.mmap.enabled

true

dfs.client.mmap.cache.size

256

dfs.client.mmap.cache.timeout.ms

3600000

dfs.client.mmap.retry.timeout.ms

300000

dfs.client.short.circuit.replica.stale.threshold.ms

1800000

dfs.namenode.path.based.cache.block.map.allocation.percent

0.25

Just like dfs.datanode.drop.cache.behind.reads, this setting causes the page cache to be dropped behind HDFS reads, potentially freeing up more memory for other uses. Unlike dfs.datanode.drop.cache.behind.reads, this is a client-side setting rather than a setting for the entire datanode. If present, this setting will override the DataNode default. If the native libraries are not available to the DataNode, this configuration has no effect.
When using remote reads, this setting causes the datanode to read ahead in the block file using posix_fadvise, potentially decreasing I/O wait times. Unlike dfs.datanode.readahead.bytes, this is a client-side setting rather than a setting for the entire datanode. If present, this setting will override the DataNode default. When using local reads, this setting determines how much readahead we do in BlockReaderLocal. If the native libraries are not available to the DataNode, this configuration has no effect.
This enables the retry cache on the namenode. Namenode tracks for non-idempotent requests the corresponding response. If a client retries the request, the response from the retry cache is sent. Such operations are tagged with annotation @AtMostOnce in namenode protocols. It is recommended that this flag be set to true. Setting it to false will result in clients getting failure responses to retried requests. This flag must be enabled in HA setup for transparent fail-overs. The entries in the cache have expiration time configurable using dfs.namenode.retrycache.expirytime.millis.
The time for which retry cache entries are retained.
This parameter configures the heap size allocated for retry cache (excluding the response cached). This corresponds to approximately 4096 entries for every 64MB of namenode process java heap size. Assuming retry cache entry expiration time (configured using dfs.namenode.retrycache.expirytime.millis) of 10 minutes, this enables retry cache to support 7 operations per second sustained for 10 minutes. As the heap size is increased, the operation rate linearly increases.
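The "7 operations per second" figure follows from the numbers in the description: roughly 4096 entries per 64MB of heap, each retained for 10 minutes. A small sketch of that arithmetic is shown here; the function name and parameters are illustrative only.

```python
def retry_cache_ops_per_second(heap_mb: float,
                               expiry_ms: int = 600_000,
                               entries_per_64mb: int = 4096) -> float:
    """Sustained rate of non-idempotent operations the retry cache can
    track, given the NameNode heap size and the entry expiry time."""
    entries = entries_per_64mb * heap_mb / 64   # capacity grows linearly with heap
    return entries / (expiry_ms / 1000)         # entries spread over the expiry window


# 4096 entries over 600 seconds is about 6.8, which rounds to 7 ops/sec.
print(round(retry_cache_ops_per_second(64)))
```

Doubling the heap doubles the rate, which matches the linear scaling stated in the description.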
If this is set to false, the client won't attempt to perform memory-mapped reads.
When zero-copy reads are used, the DFSClient keeps a cache of recently used memory mapped regions. This parameter controls the maximum number of entries that we will keep in that cache. The larger this number is, the more file descriptors we will potentially use for memory-mapped files. mmaped files also use virtual address space. You may need to increase your ulimit virtual address space limits before increasing the client mmap cache size. Note that you can still do zero-copy reads when this size is set to 0.
The minimum length of time that we will keep an mmap entry in the cache between uses. If an entry is in the cache longer than this, and nobody uses it, it will be removed by a background thread.
The minimum amount of time that we will wait before retrying a failed mmap operation.
The maximum amount of time that we will consider a short-circuit replica to be valid, if there is no communication from the DataNode. After this time has elapsed, we will re-fetch the short-circuit replica even if it is in the cache.
The percentage of the Java heap which we will allocate to the cached blocks map. The cached blocks map is a hash map which uses chained hashing. Smaller maps may be accessed more slowly if the number of cached blocks is large; larger maps will consume more memory.
The amount of memory in bytes to use for caching of block replicas in memory on the datanode. The datanode's maximum locked memory soft ulimit (RLIMIT_MEMLOCK)

dfs.datanode.max.locked.memory

dfs.namenode.list.cache.directives.num.responses

100

dfs.namenode.list.cache.pools.num.responses

100

dfs.namenode.path.based.cache.refresh.interval.ms

30000

dfs.namenode.path.based.cache.retry.interval.ms

30000

dfs.datanode.fsdatasetcache.max.threads.per.volume

dfs.cachereport.intervalMsec

10000

dfs.namenode.edit.log.autoroll.multiplier.threshold

2.0

dfs.namenode.edit.log.autoroll.check.interval.ms
dfs.webhdfs.user.provider.user.pattern

300000
^[A-Za-z_][A-Za-z0-9._-]*[$]?$

dfs.client.context

default

dfs.client.read.shortcircuit

false

dfs.domain.socket.path

must be set to at least this value, else the datanode will abort on startup. By default, this parameter is set to 0, which disables in-memory caching. If the native libraries are not available to the DataNode, this configuration has no effect.
This value controls the number of cache directives that the NameNode will send over the wire in response to a listDirectives RPC.
This value controls the number of cache pools that the NameNode will send over the wire in response to a listPools RPC.
The amount of milliseconds between subsequent path cache rescans. Path cache rescans are when we calculate which blocks should be cached, and on what datanodes. By default, this parameter is set to 30 seconds.
When the NameNode needs to uncache something that is cached, or cache something that is not cached, it must direct the DataNodes to do so by sending a DNA_CACHE or DNA_UNCACHE command in response to a DataNode heartbeat. This parameter controls how frequently the NameNode will resend these commands.
The maximum number of threads per volume to use for caching new data on the datanode. These threads consume both I/O and CPU. This can affect normal datanode operations.
Determines cache reporting interval in milliseconds. After this amount of time, the DataNode sends a full report of its cache state to the NameNode. The NameNode uses the cache report to update its map of cached blocks to DataNode locations. This configuration has no effect if in-memory caching has been disabled by setting dfs.datanode.max.locked.memory to 0 (which is the default). If the native libraries are not available to the DataNode, this configuration has no effect.
Determines when an active namenode will roll its own edit log. The actual threshold (in number of edits) is determined by multiplying this value by dfs.namenode.checkpoint.txns. This prevents extremely large edit files from accumulating on the active namenode, which can cause timeouts during namenode startup and pose an administrative hassle. This behavior is intended as a failsafe for when the standby or secondary namenode fail to roll the edit log by the normal checkpoint threshold.
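As a worked example of the autoroll threshold, the sketch below multiplies the listed multiplier default (2.0) by a checkpoint transaction count. The value of 1,000,000 for dfs.namenode.checkpoint.txns is an assumption made for illustration, since that property's value is not given in this part of the guide; the function name is likewise hypothetical.

```python
def autoroll_threshold(multiplier: float, checkpoint_txns: int) -> int:
    """Number of edits after which the active NameNode rolls its own
    edit log: the multiplier times dfs.namenode.checkpoint.txns."""
    return int(multiplier * checkpoint_txns)


# Assumed checkpoint.txns of 1,000,000 with the listed multiplier of 2.0.
print(autoroll_threshold(2.0, 1_000_000))
```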
How often an active namenode will check if it needs to roll its edit log, in milliseconds.
Valid pattern for user and group names for webhdfs, it must be a valid java regex.
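To see which names such a pattern accepts, the sketch below checks a few sample names against `^[A-Za-z_][A-Za-z0-9._-]*[$]?$`, which is assumed here to be the stock default. Python's `re` module is used for illustration; the property itself expects a Java regex, though this particular pattern uses only syntax common to both. The helper name and sample names are hypothetical.

```python
import re

# Assumed default pattern: a letter or underscore first, then letters,
# digits, dots, underscores or hyphens, with an optional trailing "$".
USER_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*[$]?$")


def is_valid_name(name: str) -> bool:
    """True when the whole name matches the user/group name pattern."""
    return USER_PATTERN.fullmatch(name) is not None


for candidate in ["hdfs", "web_user.1", "machine$", "1user", "bad name"]:
    print(candidate, is_valid_name(candidate))
```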
The name of the DFSClient context that we should use. Clients that share a context share a socket cache and short-circuit cache, among other things. You should only change this if you don't want to share with another set of threads.
This configuration parameter turns on short-circuit local reads.
Optional. This is a path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string "_PORT" is present in this path, it will be replaced by the TCP port of the DataNode.
If this configuration parameter is set, short-circuit local reads will skip checksums. This is normally not recommended, but it may be useful for special setups. You might consider using this if you are doing your own checksumming outside of HDFS.

dfs.client.read.shortcircuit.skip.checksum

false

dfs.client.read.shortcircuit.streams.cache.size

256

The DFSClient maintains a cache of recently opened file descriptors. This parameter controls the size of that cache. Setting this higher will use more file descriptors, but potentially provide better performance on workloads involving lots of seeks.

dfs.client.read.shortcircuit.streams.cache.expiry.ms

300000

This controls the minimum amount of time file descriptors need to sit in the client cache context before they can be closed for being inactive for too long.

dfs.datanode.shared.file.descriptor.paths

/dev/shm,/tmp

Comma-separated paths to the directory on which shared memory segments are created. The client and the DataNode exchange information via this shared memory segment. It tries paths in order until creation of shared memory segment succeeds.

dfs.client.use.legacy.blockreader.local

false

dfs.block.localpathaccess.user
dfs.client.domain.socket.data.traffic

false

dfs.namenode.reject-unresolved-dn-topology-mapping

false

dfs.client.slow.io.warning.threshold.ms

30000

dfs.datanode.slow.io.warning.threshold.ms

300

dfs.namenode.xattrs.enabled
dfs.namenode.fs-limits.max-xattrs-per-inode
dfs.namenode.fs-limits.max-xattr-size

true
32
16384

dfs.namenode.startup.delay.block.deletion.sec

dfs.namenode.list.encryption.zones.num.responses

100

dfs.namenode.inotify.max.events.per.rpc

1000

dfs.user.home.dir.prefix

/user

dfs.datanode.cache.revocation.timeout.ms

900000

dfs.datanode.cache.revocation.polling.ms

500

dfs.datanode.block.id.layout.upgrade.threads

12

dfs.encryption.key.provider.uri
dfs.storage.policy.enabled

true

Legacy short-circuit reader implementation based on HDFS-2246 is used if this configuration parameter is true. This is for the platforms other than Linux where the new implementation based on HDFS-347 is not available.
Comma-separated list of the users allowed to open block files on legacy short-circuit local read.
This controls whether we will try to pass normal data traffic over UNIX domain socket rather than over TCP socket on node-local data transfer. This is currently experimental and turned off by default.
If the value is set to true, then namenode will reject datanode registration if the topology mapping for a datanode is not resolved and NULL is returned (script defined by net.topology.script.file.name fails to execute). Otherwise, datanode will be registered and the default rack will be assigned as the topology path. Topology paths are important for data resiliency, since they define fault domains. Thus it may be unwanted behavior to allow datanode registration with the default rack if the resolving topology failed.
The threshold in milliseconds at which we will log a slow I/O warning in a dfsclient. By default, this parameter is set to 30000 milliseconds (30 seconds).
The threshold in milliseconds at which we will log a slow I/O warning in a datanode. By default, this parameter is set to 300 milliseconds.
Whether support for extended attributes is enabled on the NameNode.
Maximum number of extended attributes per inode.
The maximum combined size of the name and value of an extended attribute in bytes.
The delay in seconds for which block deletion is paused after Namenode startup. By default it's disabled. When a directory containing a large number of directories and files is deleted, the suggested delay is one hour, to give the administrator enough time to notice the large number of pending deletion blocks and take corrective action.
When listing encryption zones, the maximum number of zones that will be returned in a batch. Fetching the list incrementally in batches improves namenode performance.
Maximum number of events that will be sent to an inotify client in a single RPC response. The default value attempts to amortize away the overhead for this RPC while avoiding huge memory requirements for the client and NameNode (1000 events should consume no more than 1 MB).
The directory to prepend to user name to get the user's home directory.
When the DFSClient reads from a block file which the DataNode is caching, the DFSClient can skip verifying checksums. The DataNode will keep the block file in cache until the client is done. If the client takes an unusually long time, though, the DataNode may need to evict the block file from the cache anyway. This value controls how long the DataNode will wait for the client to release a replica that it is reading without checksums.
How often the DataNode should poll to see if the clients have stopped using a replica that the DataNode wants to uncache.
The number of threads to use when creating hard links from current to previous blocks during upgrade of a DataNode to block ID-based block layout (see HDFS-6482 for details on the layout).
The KeyProvider to use when interacting with encryption keys used when reading and writing to an encryption zone.
Allow users to change the storage policy on files and directories.
Determines where to save the namespace in the old fsimage format during checkpointing by standby NameNode or SecondaryNameNode. Users can dump the contents of the old

dfs.namenode.legacy-oiv-image.dir

format fsimage by oiv_legacy command. If the value is not specified, old format fsimage will not be saved in checkpoint.

dfs.namenode.top.enabled
dfs.namenode.top.window.num.buckets
dfs.namenode.top.num.users

true
10
10

Enable nntop: reporting top users on namenode
Number of buckets in the rolling window implementation of nntop
Number of top users returned by the top tool

dfs.namenode.top.windows.minutes
dfs.namenode.blocks.per.postponedblocks.rescan

1,5,25
10000

Comma-separated list of nntop reporting periods in minutes
Number of blocks to rescan for each iteration of postponedMisreplicatedBlocks.
