Theconceptandfunctionofthehashtable
Theelementsinthehashtablearedeterminedbythehashfunction.ThekeyKofthedataelementisusedasanindependentvariable,andthevaluecalculatedthroughacertainfunctionalrelationship(calledahashfunction)isthestorageaddressoftheelement.Itisexpressedas:
Addr=H(key)
Therefore,twomainproblemsneedtobesolvedbeforeestablishingahashtable:
⑴ConstructasuitableHashfunction
ThevalueofuniformityH(key)isevenlydistributedinthehashtable;
Simpletoimprovethespeedofaddresscalculation
⑵Conflict
Conflict:Inthehashtable,differentkeyvaluescorrespondtothesamestoragelocation.Thatis,thekeywordK1≠K2,butH(K1)=H(K2).Auniformhashfunctioncanreduceconflicts,butconflictscannotbeavoided.Afteraconflictoccurs,itmustberesolved;thatis,thenextavailableaddressmustbefound.
Methodstoresolveconflicts:
⑴Linkmethod(zippermethod).Storetherecordswiththesamehashaddressinalinearlinkedlist.Forexample,intheremaindermethod,thekeywordis(18,14,01,68,27,55,79),andthedivisoris13.Thehashaddressis(5,1,1,3,1,3,1),andthehashtableisshowninthefigure.
⑵Openaddressingmethod.Ifh(k)isalreadyoccupied,searchaccordingtothefollowingsequence:(h(k)+p⑴)%TSize,(h(k)+p⑵)%TSize,…,(h(k)+p(i))%TSize,...
Amongthem,h(k)isthehashfunction,TSizeisthelengthofthehashtable,andp(i)istheprobefunction.Onthebasisofh(k)+p(i-1))%TSize,ifaconflictisfound,theincrementp(i)isusedtoperformanewdetectionuntilnoconflictoccurs.Amongthem,accordingtothedifferentprobefunctionp(i),theopenaddressingmethodisdividedintolinearprobemethod(p(i)=i:1,2,3,...)andsecondprobemethod(p(i)=(-)1)^(i-1)*((i+1)/2)^2,thesequenceofdetectionis:1,-1,4,-4,9…),randomdetectionmethod(p(i):randomNumber),doublehashfunctionmethod(doublehashfunctionh(key),hp(key)Ifh(key)conflicts,thenusehp(key)tofindthehashaddress.Theprobesequenceis:h(k),h(k)+hp(k),...,h(k)+i*hp(k)).
(3)Bucketaddressingmethod.Bucket:Alargeenoughstoragespace.Bucketaddressing:Associateabucketforeachaddressinthetable.Ifthebucketisfull,openaddressingcanbeusedtodealwithit.Forexample,insertA5,A2,A3,B5,A9,B2,B9,C2,anduselinearprobingtoresolveconflicts.Asshown.
Howtoconstructahashtable
Directaddressing
Forexample:Thereisademographictablefrom1to100yearsold,whereageisthekeyWord,thehashfunctiontakesthekeyworditself.
Numericalanalysismethod
Thebirthdaydataofsomestudentsisasfollows:
Year.Month.Day
75.10.03
75.11.23
76.03.02
76.07.12
75.04.21
76.02.15
...
Afteranalysis,thefirstplace,thesecondplace,andthethirdplacearelikelytoberepeated.Takingthesethreeplaceswillincreasethechanceofconflict,sotrynottotakethefirstthreeplaces.,Thelastthreearebetter.
Squarethemiddlemethod
Takethemiddlefewdigitsafterthekeywordsquareasthehashaddress.
Foldingmethod
Splitthekeywordintoseveralpartswiththesamedigits(thedigitsofthelastpartcanbedifferent),andthentakethesuperimposedsumoftheseparts(roundup)Asahashaddress,thismethodiscalledfolding.
Forexample:everyWestern-languagebookhasaninternationalstandardbooknumber,whichisa10-digitdecimalnumber.Ifyouwanttouseitasakeytoestablishahashtable,whenthetypeoflibrarybookisnotWhenitreaches10,000,thismethodcanbeusedtoconstructafour-digithashfunction.
Divideandleaveremaindermethod
Theremainderobtainedbydividingthekeywordbyanumberpnotgreaterthanthelengthmofthehashtableisthehashaddress.
H(key)=keyMODp(p<=m)
randomnumbermethod
choosearandomfunction,taketherandomfunctionvalueofthekeyIsitshashaddress,ie
H(key)=random(key),whererandomisarandomfunction.Thismethodisusuallyusedwhenthekeywordlengthisnotequal.
Ifthehashfunctionandconflicthandlingmethodareknown,thestepstobuildthehashtableareasfollows:
Step1.TakeoutthekeyofadataelementandcalculateitinthehashtableThestorageaddressinD=H(key).IfthestoragespacewiththestorageaddressDisnotoccupied,thedataelementisstored;otherwise,aconflictoccurs,andStep2isexecuted.
Step2.Accordingtothestipulatedconflicthandlingmethod,calculatethestorageaddressunderthedataelementwhosekeyisthekey.Ifthestoragespaceofthestorageaddressisnotoccupied,saveit;otherwise,continuetoexecuteStep2untilastorageaddressthatisnotoccupiedbythestoragespaceisfound.
Conflict
Nomatterhowelaboratethehashfunctiondesignis,therewillbeconflicts,thatis,theresultsofthetwokeywordprocessingfunctionsaremappedinthesameposition.Therefore,therearesomeMethodcanavoidconflicts.
Zippermethod
Pulloutadynamiclinkedlistinsteadofastaticsequentialstoragestructure,whichcanavoidtheconflictofthehashfunction,butthedisadvantageisthatthedesignofthelinkedlististoocumbersome,whichincreasesthecomplexityofprogramming.Thismethodcancompletelyavoidtheconflictofthehashfunction.
Multi-hashmethod
Designingtwoormorehashfunctionscanavoidconflicts,butthereisstillachanceofconflict.Thebetterormorethefunctionisdesigned,theprobabilitycanbereducedMinimize(unlessthecharacteristoobad,itisalmostimpossibletoconflict).
Openaddressmethod
Theopenaddressmethodhasaformula:Hi=(H(key)+di)MODmi=1,2,...,k(k<=m-1)
Amongthem,misthetablelengthofthehashtable.diistheincrementalsequencewhenaconflictoccurs.Ifthevalueofdimaybe1,2,3,...m-1,calllineardetectionandthenhash.
Ifditakes1,thenaftereachconflict,movebackwardby1position.Ifditakes1,-1,4,-4,9,-9,16,-16,...k*k,-k*k(k<=m/2)
Itiscalledtheseconddetectionandthenhashing.Ifthevalueofdimaybeapseudo-randomnumbersequence.Itiscalledpseudo-randomdetectionandthenhashing.
Domainbuildingmethod
Assumingthatthevaluerangeofthehashfunctionis[0,m-1],setthevectorHashTable[0..m-1]asthebasictable,andSetupastoragespacevectorOverTable[0..v]tostoreconflictingrecords.