Characterization of the disciplinary intellectual base in the period 1995-2008: the bibliometric perspective


18 From this experiment it is not clear whether $\overline{R}_T = \tilde{O}(q_{\min}^{-1})$, thus implying that $R_T$ does not depend on $q_{\min}$ at all, or $\overline{R}_T$ is sublinear in $q_{\min}$, which would correspond to a dependency $R_T = \tilde{O}(q_{\min}^{-z})$ with $0 < z < 1$.

Fig. 14. Dependency of the relative regret $\overline{R}_T$ on $N$.

In this paper, we studied the problem of learning the CTRs of ads in sponsored search auctions with truthful mechanisms. This problem is highly challenging since it requires the combination of online learning tools (i.e., regret minimization algorithms) and economic tools (i.e., truthful mechanisms). While almost all the literature focused on single-slot scenarios, here we focused on multi-slot scenarios. With multiple slots it is necessary to adopt a user model to characterize how the CTR of an ad varies as the allocation of displayed ads varies. Here, we adopted the cascade model, which is the most common model used in the literature. In the paper, we studied a number of scenarios, each with a specific information setting of unknown parameters. For each scenario, we designed a truthful learning mechanism, studied its economic properties, derived an upper bound over the regret, and, for some mechanisms, also a lower bound. We considered both the regret over the auctioneer's revenue and the SW.

We showed that for the cascade model with only position-dependent externalities it is possible to design a truthful no-regret learning mechanism for the general case in which all the parameters are unknown. Our mechanism presents a regret $\tilde{O}(T^{2/3})$ and it is DSIC in expectation w.r.t. the realization of the random component of the mechanism. However, it remains open whether or not it is possible to obtain a regret $\tilde{O}(T^{1/2})$. For specific cases, in which some parameters are known to the auctioneer, we obtained better results in terms of either incentive compatibility, obtaining dominant strategy truthfulness, or regret, obtaining a regret of zero. We showed that for the cascade model with position- and ad-dependent externalities it is possible to design a DSIC a posteriori mechanism with a regret $\tilde{O}(T^{2/3})$ when only the quality is unknown. Instead, even when the cascade model is only with ad-dependent externalities and no parameter is known, it is not possible to obtain a no-regret DSIC a posteriori mechanism. The proof of this result would seem to suggest that the same result holds also when a random mechanism is adopted and the truthfulness is in expectation w.r.t. its realizations.

However, we did not produce any proof for that, leaving it for future work. Finally, we empirically evaluated the bounds we provided, showing that the dependency of the regret on the parameters is mostly correct in a worst-case scenario.

Two main questions deserve future investigation. The first question concerns the study of a lower bound for the case in which there are only position-dependent externalities and truthfulness is in expectation w.r.t. only the realizations of the random component of the mechanism or also w.r.t. the click realizations. Furthermore, it is open whether the separation of exploration and exploitation phases is necessary and, in the negative case, whether it is possible to obtain a regret $\tilde{O}(T^{1/2})$. The second question concerns a similar study related to the case with only ad-dependent externalities.

Appendix A. Vickrey–Clarke–Groves mechanism

Consider a generic direct-revelation mechanism $M = (N, V, \Theta, f, \{p_i\}_{i \in N})$ as defined in Section 3.2. Differently from the SSA case, in general the type of an agent, denoted by $v_i$ for consistency with the rest of the paper, is a vector of parameters.

We define a function $val_i : \Theta \times V \to \mathbb{R}^+$, which returns the value obtained by agent $a_i$ when its type is $v_i$ and the allocation chosen by the mechanism is $\theta$.

The VCG mechanism is obtained by coupling the two following functions:

• the allocation function $f$, which returns the allocation maximizing the social welfare, i.e.,

$$f(\hat{v}) = \arg\max_{\theta \in \Theta} SW(\theta, \hat{v}) = \arg\max_{\theta \in \Theta} \sum_{i \in N} val_i(\theta, \hat{v}_i);$$

• the payment rule $p_i$, which defines the payment required from agent $a_i$, i.e.,

$$p_i(\hat{v}) = SW(f(\hat{v}_{-i}), \hat{v}_{-i}) - SW_{-i}(f(\hat{v}), \hat{v}) = \sum_{j \in N, j \neq i} val_j(f(\hat{v}_{-i}), \hat{v}_j) - \sum_{j \in N, j \neq i} val_j(f(\hat{v}), \hat{v}_j),$$

where we denote by $f(\hat{v}_{-i})$ the allocation returned by $f$ when agent $i$ does not participate in the auction.
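The two functions above can be sketched by brute force when the set of feasible allocations is small enough to enumerate. This is only an illustrative sketch: the names `vcg`, `social_welfare`, `slot_value`, `slot_allocations` and the two-slot weights in `WEIGHTS` are hypothetical and not taken from the paper.

```python
from itertools import permutations

def social_welfare(allocation, bids, val, exclude=None):
    """Sum of val(i, allocation, bid_i) over all agents except `exclude`."""
    return sum(val(i, allocation, b) for i, b in bids.items() if i != exclude)

def vcg(agents, allocations, bids, val):
    """Brute-force VCG: welfare-maximizing allocation plus the payments p_i.

    `allocations(agent_list)` is a hypothetical helper enumerating the
    feasible allocations for a given list of participating agents.
    """
    best = max(allocations(agents), key=lambda th: social_welfare(th, bids, val))
    payments = {}
    for i in agents:
        others = [j for j in agents if j != i]
        bids_wo_i = {j: bids[j] for j in others}
        # Allocation f(v_{-i}) chosen when agent i does not participate.
        best_wo_i = max(allocations(others),
                        key=lambda th: social_welfare(th, bids_wo_i, val))
        # p_i = SW(f(v_{-i}), v_{-i}) - SW_{-i}(f(v), v)
        payments[i] = (social_welfare(best_wo_i, bids_wo_i, val)
                       - social_welfare(best, bids, val, exclude=i))
    return best, payments

# Toy instance (assumed for illustration): two slots with fixed weights.
WEIGHTS = [1.0, 0.5]

def slot_value(i, allocation, v_i):
    """Value of agent i: slot weight times its reported value, 0 if not shown."""
    return WEIGHTS[allocation.index(i)] * v_i if i in allocation else 0.0

def slot_allocations(agents):
    """All ordered assignments of agents to the available slots."""
    return permutations(agents, min(len(WEIGHTS), len(agents)))

best, pay = vcg(["a", "b", "c"], slot_allocations,
                {"a": 10.0, "b": 6.0, "c": 2.0}, slot_value)
print(best, pay)
```

In this toy instance each agent pays the welfare loss its presence imposes on the others, which is why the agent left without a slot pays nothing.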

In this quasi-linear environment, when there are no interdependencies among the types of the agents and the no-single-agent effect [3] holds, the VCG mechanism is AE, DSIC a posteriori, IR a posteriori, and WBB a posteriori.

Appendix B. Monotonicity and Myerson's payments

Consider a generic direct-revelation mechanism $M = (N, V, \Theta, f, \{p_i\}_{i \in N})$ as defined in Section 3.2. A single-parameter linear environment is such that

• the type of each agent $a_i$ is a scalar $v_i$ (single-parameter assumption),

• the utility function of agent $a_i$ is $u_i(\hat{v}) = z_i \ldots$

An allocation function $f$ is monotonic in a single-parameter linear environment if for any $\hat{v}_{-i}$ … $z_i$ … possible to design a DSIC mechanism imposing the following payments [35]:

$$p_i(\hat{v}) = h_i(\hat{v}_{-i}) + z_i \ldots$$

Appendix C. Proof of revenue regret in Theorem 2

We start by reporting the proof of Proposition 1.

Proof of Proposition 1. The derivation is a simple application of Hoeffding's bound. We first notice that each of the terms in the empirical average $\tilde{q}_i$ (Eq. (11)) is bounded in $[0; 1/\Lambda_{\pi(i;\theta_t)}]$. Thus we obtain

By reordering the terms in the previous expression we have

$\eta = \ldots$

which guarantees that all the empirical estimates $\tilde{q}_i$ are within $\eta$ of $q_i$ for all the ads with probability at least $1 - \delta$. □

Before stating the main result of this section, we need the following lemma.

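Lemma 1. For any slot $s_m$ with $m \in K$, with probability $1 - \delta$,

$$\max_{i \in N}(\tilde{q}^+_i \hat{v}_i; m) \geq \max_{i \in N}(q_i \hat{v}_i; m).$$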

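Proof. The proof is a straightforward application of Proposition 1. We consider the optimal allocation $\theta^*$ defined in Eq. (2) and the optimal allocation $\tilde{\theta}$ when estimates $\tilde{q}^+$ are adopted, defined in Eq. (16). We denote $h = \alpha(m; \theta^*) \in \arg\max_{i \in N}(q_i \hat{v}_i; m)$, i.e., the index of the ad allocated in a generic slot in position $m$. There are two possible scenarios:

• If $\pi(h; \tilde{\theta}) < m$ (the ad is displayed into a higher slot in the approximated allocation $\tilde{\theta}$), then $\exists j \in N$ s.t. $\pi(j; \theta^*) < m \wedge \pi(j; \tilde{\theta}) \geq m$. Thus

$$\max_{i \in N}(\tilde{q}^+_i \hat{v}_i; m) \geq \tilde{q}^+_j \hat{v}_j \geq q_j \hat{v}_j \geq q_h \hat{v}_h = \max_{i \in N}(q_i \hat{v}_i; m),$$

where the second inequality holds with probability $1 - \delta$;

• If $\pi(h; \tilde{\theta}) \geq m$ (the ad is displayed into a lower or equal slot in the approximated allocation $\tilde{\theta}$), then

$$\max_{i \in N}(\tilde{q}^+_i \hat{v}_i; m) \geq \tilde{q}^+_h \hat{v}_h \geq q_h \hat{v}_h = \max_{i \in N}(q_i \hat{v}_i; m),$$

where the second inequality holds with probability $1 - \delta$.

In both cases, the statement follows. □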
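The role of the confidence radius $\eta$ and of the operator $\max(\cdot; m)$ can be illustrated with a small sketch. It assumes the textbook two-sided Hoeffding inequality with a union bound over the ads; the function names and the `upper` parameter (the upper end of the range of each sample) are hypothetical, and the constants are not meant to reproduce the exact expression used in Proposition 1.

```python
import math

def hoeffding_radius(n_samples, n_ads, delta, upper=1.0):
    """Radius eta such that, for samples in [0, upper], all n_ads empirical
    means are within eta of their expectations with probability >= 1 - delta
    (two-sided Hoeffding inequality plus a union bound over the ads)."""
    return upper * math.sqrt(math.log(2 * n_ads / delta) / (2 * n_samples))

def mth_largest(values, m):
    """The max(.; m) operator: the m-th largest entry, with m starting at 1."""
    return sorted(values, reverse=True)[m - 1]

def optimistic_dominates(q, q_hat, v, eta):
    """Check max_i(q_plus_i * v_i; m) >= max_i(q_i * v_i; m) for every m,
    where q_plus_i = q_hat_i + eta is the optimistic estimate."""
    optimistic = [(qh + eta) * vi for qh, vi in zip(q_hat, v)]
    true = [qi * vi for qi, vi in zip(q, v)]
    return all(mth_largest(optimistic, m) >= mth_largest(true, m)
               for m in range(1, len(q) + 1))

q = [0.3, 0.2, 0.1]        # true qualities (made-up numbers)
q_hat = [0.25, 0.28, 0.05]  # empirical estimates, each within eta of q
v = [1.0, 2.0, 3.0]        # reported values
eta = hoeffding_radius(n_samples=200, n_ads=3, delta=0.05)
print(eta, optimistic_dominates(q, q_hat, v, eta))
```

Whenever every optimistic estimate is at least the true quality, the ordering of the $m$-th largest products follows for all positions at once; without the added radius the same check can fail.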

Proof of Theorem 2.

Step 1: expected payments. The proof follows steps similar to those in the proofs in [20]. We first recall that since the mechanism is DSIC in expectation w.r.t. the clicks, we can directly focus on the regret when the actual values $v$ are bid. For any ad $a_i$ such that $\pi(i; \theta^*) \leq K$, the expected payments of the VCG mechanism in this case reduce to Eq. (9):

while, given the definition of A-VCG1 reported in Section 4.1, the expected payments for the $t$-th iteration of the auction are

Step 2: per-step exploration regret. Since for any $1 \leq t \leq \tau$, A-VCG1 sets all the payments to 0, the per-step regret is

$r_t = \ldots$

Step 3: per-step exploitation regret. Now we focus on the expected (w.r.t. click realizations) per-step regret during the exploitation phase. According to the definition of payments, at each step $t \in \{\tau + 1, \ldots, T\}$ of the exploitation phase we bound the per-step regret $r_t$ as

$r_t = \ldots$

By definition of the max operator, since $l + 1 > m$, it follows that

$\max \ldots$

with probability at least $1 - \delta$. Notice that, by definition of $\Delta_l$, $\sum_{l=m}^{K} \Delta_l = \Lambda_m - \Lambda_{K+1} = \Lambda_m$. Furthermore, from the definition of $\tilde{q}^+_i$ and using Eq. (14) we have that for any ad $a_i$:

$$\tilde{q}^+_i - q_i = \tilde{q}_i - q_i + \eta \leq 2\eta,$$

with probability at least $1 - \delta$. Thus, the difference between the payments becomes

$r_t \leq 2 v_{\max} \ldots$

Step 4: cumulative regret. We first consider the (low-probability) event in which the bound on $\tilde{q}^+_i$ derived in Proposition 1 does not hold. In this case, we cannot guarantee anything about the behavior of the mechanism, since the payments are based on very inaccurate estimates of the CTRs, and thus the largest possible regret is suffered. In particular, we consider the worst-case loss of $v_{\max}$ for each slot for each step, leading to a total regret of $v_{\max} \sum_{m=1}^{K} \Lambda_m T$ with probability $\delta$. By summing up the regrets reported in Eq. (C.3) during the exploration phase and Eq. (C.6) during the exploitation phase and by considering that these bounds hold with probability at least $1 - \delta$ (upper-bounded by 1 in the following), we obtain an expected regret

$\ldots \leq v_{\max} \ldots$

where $R_{ei}$ is the upper bound on the regret suffered during the exploitation phase (which holds with probability at least $1 - \delta$), $R_{er}$ is the upper bound on the regret suffered during the exploration phase (which holds with probability at least $1 - \delta$) and $R_\delta$ is the upper bound on the regret when the bounds do not hold (with probability at most $\delta$). This bound can be further simplified, given that $\sum_{m=1}^{K} \Lambda_m \leq K$, as

Step 5: parameters optimization. Besides describing the performance of A-VCG1, the previous bound also provides guidance for the optimization of the parameters $\tau$ and $\delta$. We first simplify the bound in Eq. (C.7) as

$R_T \leq v_{\max} K \ldots$

We then take the derivative of the previous bound w.r.t. $\tau$, set it to zero and obtain

$v_{\max} K \ldots$

Substituting this value of $\tau$ into Eq. (C.8) leads to the optimized bound

$\ldots \leq v_{\max} K \ldots$
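To see how an exploration length of this kind produces a $T^{2/3}$ rate, consider a bound with the generic shape below. The shape and the constants $A$ and $B$ are assumptions made only for illustration (a term linear in $\tau$ for exploration, and a term in $T \tau^{-1/2}$ for exploitation with a confidence radius shrinking as $\tau^{-1/2}$); they are placeholders, not the constants of Eq. (C.8).

```latex
\begin{align*}
g(\tau) &= A\,\tau + B\,T\,\tau^{-1/2}, \qquad A, B > 0, \\
g'(\tau) &= A - \tfrac{1}{2}\,B\,T\,\tau^{-3/2} = 0
  \;\Longrightarrow\;
  \tau^{*} = \left(\frac{B\,T}{2A}\right)^{2/3}, \\
g(\tau^{*}) &= A\,\tau^{*} + B\,T\,(\tau^{*})^{-1/2}
  = \left(2^{-2/3} + 2^{1/3}\right) A^{1/3} (B\,T)^{2/3}
  = 3 \cdot 2^{-2/3}\, A^{1/3} (B\,T)^{2/3}.
\end{align*}
```

Under this assumed shape both the optimal exploration length and the resulting bound grow as $T^{2/3}$, which matches the order of the regret stated for the mechanism.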

19 Notice that in the logarithmic term the factor of 2 we have in Proposition 1 disappears since in this proof we only need the one-sided version of the bound.

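We are now left with the choice of the confidence parameter $\delta \in (0, 1)$, which can be easily set to optimize the asymptotic rate (i.e., ignoring constants and logarithmic factors) as

$$\delta = K^{-1/3} T^{-1/3} N^{1/3}.$$

We thus obtain the final bound

$$R_T \leq 4 v_{\max} \Lambda_{\min}^{-2/3} K^{2/3} T^{2/3} N^{1/3} \log\left(K^{1/3} T^{1/3} N^{2/3}\right).$$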
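The choice of $\delta$ can be checked numerically. The sketch below uses hypothetical function names and made-up values of $K$, $T$ and $N$; it verifies that $\log(N/\delta)$ coincides with the logarithmic term $\log(K^{1/3} T^{1/3} N^{2/3})$ and that $\delta < 1$ exactly when $T > N/K$. The `rate` function deliberately ignores constants, $v_{\max}$, the $\min$-dependent factor and the logarithm.

```python
import math

def optimized_delta(K, T, N):
    """delta = K^{-1/3} T^{-1/3} N^{1/3}, the choice stated in the text."""
    return (N / (K * T)) ** (1.0 / 3.0)

def log_term(K, T, N):
    """log(N / delta) evaluated at the optimized delta."""
    return math.log(N / optimized_delta(K, T, N))

def rate(K, T, N):
    """Asymptotic rate K^{2/3} T^{2/3} N^{1/3} (constants and logs ignored)."""
    return (K * T) ** (2.0 / 3.0) * N ** (1.0 / 3.0)

K, T, N = 5, 100_000, 50  # made-up values for illustration
delta = optimized_delta(K, T, N)
direct = math.log(K ** (1 / 3) * T ** (1 / 3) * N ** (2 / 3))
print(delta, log_term(K, T, N), direct)
print(rate(K, 8 * T, N) / rate(K, T, N))  # growth factor when T is multiplied by 8
```

Multiplying the horizon by 8 multiplies the rate by about 4, so the per-step regret shrinks as $T$ grows.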

We have to impose the constraints that $T > \frac{N}{K}$ (given by $\delta < 1$) and that $T > \tau$, i.e., $T > \frac{N}{K \Lambda_{\min}^{2}} \log\frac{N}{\delta}$. The two constraints imply:

$T > \ldots$
