Basic overview
Search engine inclusion refers to the number of a website's pages that search engines have indexed. The more pages that are included and the faster they are indexed, the friendlier the website is considered to be to search engines.
Commonly used search engines include Baidu, Google, Yahoo, Sogou, Youdao (有道), Soso (搜搜), Bing (必应), and 360.
Principle of inclusion
Collect the URLs of web pages to be indexed
The number of web pages on the Internet is astronomical, and countless new pages appear every day. A search engine must first find the pages that are to be indexed.
As far as Google is concerned, there is controversy over whether Googlebot really consists of a distinct DeepBot and FreshBot, and opinions differ on whether these two names should be used at all. The names themselves are not important, at least for now.
The mainstream view is that among Google's robots there are indeed quite a few that prepare "materials" for the pages that are actually indexed. Let's call these FreshBot here.
Their task is to scan the Internet constantly, every day, to discover and maintain a huge list of URLs for DeepBot to use. In other words, when FreshBot visits and reads a web page, its purpose is not to index that page but to find all the links in it.
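The link-discovery step described above can be illustrated with a minimal sketch. It is not Google's implementation; the class name LinkCollector, the function discover_links, and the example URLs are all hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def discover_links(base_url, html):
    """Return every link found in the page, without indexing its content."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Made-up page content for illustration.
page = '<a href="/about.html">About</a> <a href="http://other.example/x">X</a>'
print(discover_links("http://www.example.com/index.html", page))
```

The function only returns URLs and keeps nothing else from the page, which mirrors the idea that this kind of visit builds a URL list rather than an index.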
Of course, this seems inefficient, which is a little hard to believe. However, we can make a simple judgment in the following way: FreshBot is not "exclusive" when scanning web pages.
In other words, multiple robots located in different Google data centers may visit the same page within a short period of time, such as one day or even one hour, whereas this does not happen when DeepBot is indexing and caching a page.
That is, Google restricts this work to the robots of one data center, rather than having two data centers index the same version of a web page at the same time. If there is no flaw in this statement, then the server access log, in which you can often see Googlebots from different IPs visiting the same web page several times within a short period, would seem to prove the existence of FreshBot.
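The access-log check mentioned above can be sketched as follows. This assumes the server writes the widely used combined log format; the function name googlebot_ips_by_path and the sample log lines (with documentation-range IP addresses) are made up for illustration.

```python
import re
from collections import defaultdict

# Combined log format:
# IP - - [time] "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_ips_by_path(lines):
    """Map each requested path to the set of IPs that fetched it as Googlebot."""
    ips = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            ips[match.group("path")].add(match.group("ip"))
    return dict(ips)


# Made-up sample lines, not real log data.
sample_log = [
    '192.0.2.1 - - [10/Oct/2006:13:55:36 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '192.0.2.2 - - [10/Oct/2006:14:02:10 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '198.51.100.7 - - [10/Oct/2006:14:05:00 +0000] "GET /b.html HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows)"',
]

for path, addresses in googlebot_ips_by_path(sample_log).items():
    if len(addresses) > 1:
        print(path, "was fetched by Googlebot from", len(addresses), "different IPs")
```

Note that the user-agent string can be set by any client, so a match on "Googlebot" alone does not prove that a request really came from Google.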
Therefore, if you find that Googlebot visits your website frequently, don't be happy too early. It may not be indexing web pages at all, only scanning URLs.
The information recorded by FreshBot includes the URL of the web page, a TimeStamp (the time when the web page was created or updated), and the header information of the web page. (Note: this point is controversial. Many people believe that FreshBot does not read the information of the target web page and that DeepBot completes this part of the work. However, the author prefers the former statement, because the URL list that FreshBot submits to DeepBot excludes pages that the website has set as not to be indexed or included, which improves efficiency. Besides robots.txt, a considerable share of websites make this kind of setting through "noindex" in a meta tag, and honoring it without reading the head of the target page seems impossible.)
If a web page is not accessible, for example because of a network interruption or a server failure, FreshBot records the URL and tries again at an appropriate time, but does not add it to the URL list submitted to DeepBot until the URL is accessible.
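The two exclusion mechanisms named in the note, robots.txt and a "noindex" meta tag, can be checked with the Python standard library. This is a minimal sketch under stated assumptions, not a description of how any real crawler works: the function may_index, the class RobotsMetaParser, the agent name "ExampleBot", and the sample robots.txt are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Sets noindex to True if the page has <meta name="robots" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def may_index(url, robots_txt, html, agent="ExampleBot"):
    """Return True only if neither robots.txt nor a meta tag excludes the page."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        return False  # excluded by robots.txt, no need to read the page
    meta = RobotsMetaParser()
    meta.feed(html)  # the meta tag is only visible after reading the page
    return not meta.noindex


# Made-up robots.txt content for illustration.
robots = "User-agent: *\nDisallow: /private/"
print(may_index("http://www.example.com/private/x.html", robots, "<html></html>"))
print(may_index("http://www.example.com/page.html", robots,
                '<head><meta name="robots" content="noindex,follow"></head>'))
```

The sketch makes the author's argument concrete: the robots.txt check needs only the URL, while the "noindex" check cannot be made without fetching and parsing the head of the page.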
In general, FreshBot takes up relatively little server bandwidth and resources. Finally, FreshBot classifies the recorded information by priority and submits it to DeepBot. The main priority classes are as follows:
A: New web page;
B: Old web page / new TimeStamp, that is, a web page that has been updated;
C: Web page that uses a 301/302 redirect;
D: Complex dynamic URL, such as a dynamic URL with multiple parameters, for which Google may need additional work to analyze the content correctly. As Google's support for dynamic web pages improves, this class may have been dropped;
E: Other types of files, such as links to PDF and DOC files, whose indexing may also require additional work;
F: Old web page / old TimeStamp, that is, a web page that has not been updated. Note that the timestamp here is not compared with the date displayed in the Google search results, but with the date in the Google index database;
G: Wrong URL, that is, a page that returns a 404 response when accessed.
Priority decreases in order from A to G. It should be emphasized that the priority mentioned here is relative. For example, among new web pages, priority varies greatly with the quality and quantity of the links pointing to them: pages with links from relevant authoritative websites have a higher priority.
In addition, the priority referred to here applies only to pages within the same website. In fact, different websites have different priorities. In other words, for pages on an authoritative website, even a lowest-priority 404 URL may have an advantage over the highest-priority newly created pages on many other sites.
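The A to G classification can be sketched as a small sorting routine. Everything here is an assumption made for illustration: the UrlRecord fields, the order in which the checks are applied when a URL could fit more than one class, and the rule that "more than one query parameter" means a complex dynamic URL. The sketch also ignores link quality and site authority, which the text says shift priority further.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qsl, urlsplit


@dataclass
class UrlRecord:
    url: str
    status: int                              # HTTP status seen when scanning
    timestamp: int                           # when the page was created or updated
    indexed_timestamp: Optional[int] = None  # date in the index, None for a new page
    content_type: str = "text/html"


def classify(rec: UrlRecord) -> str:
    """Assign one of the classes A to G described above."""
    if rec.status == 404:
        return "G"
    if rec.status in (301, 302):
        return "C"
    if rec.content_type != "text/html":
        return "E"  # e.g. PDF or DOC files
    if len(parse_qsl(urlsplit(rec.url).query)) > 1:
        return "D"  # dynamic URL with multiple parameters
    if rec.indexed_timestamp is None:
        return "A"
    if rec.timestamp > rec.indexed_timestamp:
        return "B"
    return "F"


def crawl_order(records):
    """Sort records so that class A comes first and class G last."""
    return sorted(records, key=classify)


# Made-up records for illustration.
records = [
    UrlRecord("http://example.com/gone", 404, 0),
    UrlRecord("http://example.com/old", 200, 5, 5),
    UrlRecord("http://example.com/new", 200, 9),
    UrlRecord("http://example.com/updated", 200, 9, 5),
]
for rec in crawl_order(records):
    print(classify(rec), rec.url)
```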
Index and inclusion of web pages
Only then does the actual process of indexing and including web pages begin. As the introduction above shows, the URL list submitted by FreshBot is quite large. Depending on language, website location, and so on, the indexing of specific websites is allocated to different data centers.
Because of the huge amount of data, the entire indexing process may take several weeks or even longer to complete.
As mentioned above, DeepBot first indexes higher-priority websites and web pages. The higher the priority, the sooner a page appears in the Google index database and eventually on the Google search results page.
For a new web page, as long as it has entered this stage, it may appear in the Google index even before the entire indexing process is completed. Many readers who search Google with "site:" have probably seen pages marked as supplementary results that display only the page URL, or only the page title and URL without a description. This is the normal result for a web page at this stage.
When Google actually reads, analyzes, and caches the page, it takes the page out of the supplementary results and displays its normal information.
Of course, the premise is that the web page has enough links, especially links from authoritative websites, and that the index contains no record that is the same as or similar to the content of the web page (Duplicate Content filtering).
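The text does not say how "the same or similar" content is detected. One common textbook technique, used here purely as an illustration and not as Google's actual method, is to compare sets of word shingles with the Jaccard similarity. The function names and the shingle size k=3 are assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    # Texts shorter than k words yield a single, shorter shingle.
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


print(jaccard("the quick brown fox jumps", "the quick brown fox sleeps"))
```

A value near 1 suggests near-duplicate content and a value near 0 suggests unrelated content. Where a filter would draw the line between them is a tuning decision that this sketch does not address.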
For dynamic URLs, although Google now claims that it has no obstacles in processing them, the observable facts still show that dynamic URLs are more likely than static URLs to appear in supplementary results. Such web pages often need more, and more valuable, links to escape from the supplementary results.
For the "F" category above, that is, web pages that have not been updated, DeepBot compares the page's timestamp with the date in the Google index database. Even if the corresponding page information in the search results has not yet been updated, it is enough that the latest version has been indexed (consider the case of a web page that has been updated and modified several times). As for the "G" category, that is, 404 URLs, DeepBot looks up whether there is a corresponding record in the index and deletes it if there is.
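The handling of the "F" and "G" categories can be sketched with a plain dictionary standing in for the index. This is a hypothetical simplification: the function apply_to_index and the idea of storing only one timestamp per URL are assumptions made for illustration.

```python
def apply_to_index(index, url, status, timestamp):
    """Update a toy index that maps each URL to the timestamp of its indexed version."""
    if status == 404:
        # Category G: delete the record if one exists.
        if url in index:
            del index[url]
            return "deleted"
        return "ignored"
    if url not in index or timestamp > index[url]:
        index[url] = timestamp  # new page, or a newer version than the indexed one
        return "indexed"
    # Category F: the indexed version is already the latest.
    return "unchanged"


# Made-up index content for illustration.
index = {"http://example.com/a": 5, "http://example.com/b": 5}
print(apply_to_index(index, "http://example.com/a", 200, 5))
print(apply_to_index(index, "http://example.com/b", 404, 0))
```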
Synchronization between data centers
As mentioned earlier, when DeepBot indexes a web page, the work is done by one specific data center, rather than by multiple data centers each reading the page at the same time to obtain its latest version. Therefore, after the indexing process is completed, a data synchronization process is needed to update the latest version of the web page across the data centers.
This is what used to be the famous Google Dance. However, since the BigDaddy update, synchronization between data centers is no longer concentrated in a specific time period as it was before, but is carried out in a continuous and more timely manner.
Influencing inclusion
Site title
Writing the site title, description, and keywords has always been something webmasters treat with great caution. These three tags are directly related to the ranking and traffic of the website, and they cannot be casually modified after the website goes online, so webmasters need to prepare them in advance. If you do not plan ahead and modify them after going online, Baidu will regard your website as unstable: modifying key tags right after launch makes you a suspect for cheating, and Baidu then puts your website into the sandbox and examines it slowly. At that point, expect Baidu to take at least another month to include the website, and make sure to add high-quality articles to the website every day during this period.
External links
Adding external links allows search engines to crawl and include web pages efficiently.
Website content
Original website content is more easily included, whereas content produced by scraping or copying other people's information is generally difficult to get included.
The biggest advantage of original articles is that they serve several purposes at once: they increase the probability that a website is included by search engines and they improve its search ranking.
Features of Baidu
1. An information processing method based on word combination cleverly solves the problem of understanding Chinese information and greatly improves the precision and recall of search.
2. Supports the mainstream Chinese encodings, including GBK (Chinese character internal code extension specification), GB2312 (simplified), and BIG5 (traditional), and can convert between different encodings.
3. The intelligent relevance algorithm combines content-based and hyperlink-based analysis to evaluate relevance, which allows it to analyze the information contained in web pages objectively and so maximize the relevance of search results.
4. Search results are more intuitive: they show rich page attributes (such as title, URL, time, size, encoding, and abstract) and highlight the user's query string, which makes it easy for users to judge whether to read the original text.
5. Baidu supports secondary search, which continues searching within the previous search results and gradually narrows the scope until it reaches the smallest and most accurate result set. This makes it easier for users to find the content they are really interested in among massive amounts of information.
6. Intelligent recommendation of related search terms: after the user's first search, related search terms are suggested to help users find more relevant results. Statistics show that this can increase search volume by 10-20%.
7. High performance: multi-threading technology, efficient search algorithms, a stable UNIX platform, and localized servers ensure the fastest response speed. The Baidu search engine provides its search service from within China, which greatly shortens retrieval response time (the average response time of a retrieval is less than 0.5 seconds).
8. It can provide multiple service methods, with updates within 7 days. It is the Chinese search engine with the fastest update time and the largest amount of data at present.
9. Search result output supports category aggregation in a variety of forms, such as content aggregation, website aggregation, and content aggregation + website aggregation. Users can also select a time range, which improves retrieval efficiency.
10. Intelligent, scalable search technology and the world's largest Chinese information database provide a solid foundation for giving users the most accurate, most extensive, and most timely information.
11. A distributed architecture with optimized structure and algorithms, well-designed optimization algorithms, and fault-tolerant design ensure the system's high availability, high scalability, high performance, and high stability under a large number of visits.
12. High configurability enables the search service to meet the needs of different users.
13. Advanced dynamic web page summary display technology.
14. The unique Baidu snapshot.
15. Supports a variety of advanced search syntax, making user queries more efficient and results more accurate: "+" (and), "-" (not), "|" (or), "site:", "domain:", "intitle:", and "inurl:". Other efficient search syntax will continue to be added.
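The conversion between encodings mentioned in item 2 can be illustrated with Python's built-in codecs, which include "gbk", "gb2312", and "big5". This is only a sketch of what such a conversion involves, not Baidu's implementation, and the function name convert_encoding is hypothetical.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from one Chinese encoding and re-encode them in another."""
    return data.decode(source).encode(target)


text = "中文"
gb2312_bytes = text.encode("gb2312")
big5_bytes = convert_encoding(gb2312_bytes, "gb2312", "big5")

# The byte sequences differ, but both decode to the same text.
print(gb2312_bytes != big5_bytes)
print(big5_bytes.decode("big5") == text)
```

A conversion like this only works for characters that exist in both character sets. A character missing from the target set raises UnicodeEncodeError unless an error handler is supplied.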
Increasing the inclusion
Once a search engine has included the site and you can see the number of included pages, the natural hope is for the search engine to include more pages. Adding a large amount of content to the website is one way to increase the number of included pages, but more importantly you need to create a good website structure for the search engines' spider programs. To increase the site's inclusion rate, you can take the following measures:
Improve external links
External links are good medicine for SEO, whether the goal is to improve search engine ranking or to increase the number of included pages, and high-quality external links especially so. Link building must accompany the search engine optimization program from beginning to end.
Add original content
Once original content is included by search engines, those pages are not easily deleted. If the content of a website is highly repetitive, then even after it is included it is liable to be cleaned out by search engines on a regular basis. Keeping a certain percentage of original content on the website builds the website's weight and helps ensure that search engines do not include these pages and then delete them.
Optimize the structure
Optimize the internal links of the website. A good website structure lets spiders follow the links and read the website's content layer by layer. A website with a poor structure makes spiders feel as if they have entered a maze. If your website is very large, it is best to provide user-experience features such as clear navigation and a comprehensive sitemap, which both guide inclusion and help the website's users.
Research inclusion
The search engine's inclusion program is a collector with the ability to think and discriminate. Do not treat it as a simple porter of website content. When it reads your content, it judges the value of that content, among other things. As a website administrator, you have to study the rules of inclusion, the crawling rules, and so on; dealing with search engine inclusion is an important subject in its own right. To increase the number of pages included, we have to be more proactive. In other words, take the initiative: rather than waiting for inclusion to come, it is better to guide it.
Sitemap
A sitemap is a page that holds links to all the pages on the website. When most people cannot find the information they need on a website, they may turn to the sitemap as a remedy. Search engines like sitemaps very much.
Why build a sitemap? Most people know that sitemaps improve the user experience: they give site visitors directions and help lost visitors find the page they want to see. For search engine optimization, the sitemap has even more benefits:
1. It gives search engines links for browsing the entire website.
2. It gives search engines links to dynamic pages or pages that are difficult to reach by other means.
3. As a potential landing page, it can be optimized for search traffic.
4. If a visitor tries to access a URL that does not exist on the website's domain, the visitor is redirected to a "File not found" error page, and the sitemap can be used as the "quasi" content of that page.
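A sitemap page of the kind described above, a single page linking to every page on the site, can be generated with a few lines of code. This is a minimal sketch: the function build_sitemap_page, the page list, and the HTML layout are assumptions made for illustration.

```python
from html import escape


def build_sitemap_page(pages):
    """Build an HTML page linking to each (url, title) pair in pages."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head><title>Site map</title></head>\n<body>\n"
        "  <ul>\n" + items + "\n  </ul>\n</body>\n</html>\n"
    )


# Made-up page list for illustration.
pages = [
    ("/index.html", "Home"),
    ("/about.html", "About"),
    ("/list.php?cat=1&page=2", "Products & Services"),
]
print(build_sitemap_page(pages))
```

Escaping the URL and title keeps characters such as "&" from breaking the markup, which matters for the dynamic URLs that item 2 mentions.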
New site inclusion
Solutions when Baidu does not include a new site:
(1) It is best to wait until all the content of the website is complete before uploading it to the web hosting space
(2) After the website is uploaded, submit it to Baidu (through the submission portals of the major search engines)
(3) Register 3-5 accounts on Baidu Soucang, then bookmark the site's URLs
(4) Bookmark the URLs on Leshou, Cape of Good Hope, and other online bookmarking sites
(5) Post link bait (containing your own website's URL) on Baidu Tieba, A5, and other high-weight websites to lure Baidu into crawling and including the site
(6) Update the site regularly with 2-5 original articles every day for the first month
(7) Do not use cheating SEO methods for optimization
If you basically follow the steps above, the home page can be included within 1-30 days. If a month has passed and the URL has still not been included, you can try modifying the layout of the home page.