Publications

Francesco Orabona, Dávid Pál
ScaleFree Online Learning
Summary:
This is the journal version of the ALT 2015 paper. We design and analyze 3
algorithms for online linear optimization that are invariant to scaling of the
loss vectors. Two of them are instances of Follow The Regularized Leader (FTRL),
the third is an instance of Mirror Descent. We show regret bounds for all 3
algorithms. One of the FTRLbased algorithms works for any domain, bounded or
unbouded. In contrast, we give examples demonstrating that the
MirrorDescentbased algorithm has at least linear regret if the Bregman
divergence induced by the regularizer is unbounded; this happens, in particular,
when the domain is unbounded.
Bib info: Submitted to the journal Theorerical Computer Science (ALT 2015 special issue)
Download:
[arXiv]
[GitHub]
[sources]

Francesco Orabona, Dávid Pál
Optimal NonAsymptotic Lower Bound on the Minimax Regret of Learning with Expert Advice
Summary:
We show a nonasymptotic optimal lower bounds on the expected value of maximum
of independent Gaussians and the expected value of maximum of independent
symmetric random walks. Both bounds are asymptotically optimal, i.e. they imply
known limit theorems for these quantities. Simple application of the lower
bound for symmetric random walks is a lower bound of the regret for the problem
of learning with expert advice which is a classical online learning problem.
Bib info: draft
Download:
[arXiv]

Satyen Kale, Chansoo Lee, Dávid Pál
Hardness of Online Sleeping Combinatorial Optimization Problems
Summary:
We show computational hardness of sleeping variants of six combinatorial
sequential decision making problems (Online Shortest Paths, Online Minimum
Spanning Tree, Online kSubsets, Online kTruncated Permutations,
Online Minimum Cut, and Online Bipartite Matching). We show that sleeping
variants of these problems are as hard as PAC learning DNF expressions.
Bib info: draft
Download:
[arXiv]

Francesco Orabona, Dávid Pál
ScaleFree Algorithms for Online Linear Optimization
Summary:
We design algorithms for online linear optimization that are invariant to
scaling of the loss vectors. Scale invariance, as a design principle, almost
instantly leads to better algorithms and cleaner regret analysis. We design and
analyze two scalefree variants of Follow The Regularized Leader Algorithm. The
two algorithms are simple, fully adaptive and work with any strongly convex
regularizer.
Bib info: Proceedings of ALT 2015
Download:
[PDF]
[arXiv]
Slides
[1]
[2]
[3]

Gábor Bartók, Dávid Pál, Csaba Szepesvári, István Szita
Online Learning Lecture Notes
Summary:
In spring 2011, Csaba taught an online learning course with occasional help from
Gábor and István and me. As we were teaching, we were continuously writing
lecture notes. The focus of the course was noregret nonstochastic learning
(learning with expert advice, switching experts, follow the regularized leader
algorithm, online linear regression), and brief explanation of onlinetobatch
conversion and multiarmed bandits. Our goal was to develop the theory from
first principles.
Bib info: lecture notes
Download:
[PDF]

Yasin AbbasiYadkori, Dávid Pál, Csaba Szepesvári
OnlinetoConfidenceSet Conversions and Application to Sparse Stochastic Bandits
Summary:
We show how to convert a regret bound for an online algorithm for linear
prediction under quadratic loss (e.g. online least squares, online gradient,
pnorm algorithm) into a confidence set for the weight vector of the
linear model under any subGaussian noise. The lower the regret of the
algorithm, the smaller the confidence set. As an application, we use these
confidence sets to construct online algorithms for the linear stochastic bandit
problem. Using an online linear prediction algorithm for sparse models, we are
able to construct an algorithm for the sparse linear stochastic bandit problem
i.e. problem where the underlying linear model is sparse.
Bib info: Proceedings of AISTATS 2012
Download:
[PDF],
[Slides],
[Overview Talk]

Yasin AbbasiYadkori, Dávid Pál, Csaba Szepesvári
Improved Algorithms for Linear Stochastic Bandits
Summary:
We provide an upper bound for the tail probability of the norm of a
vectorvalued martingale. This bound immediately gives rise to a confidence set
(ellipsoid) for online regularized linear leastsquares estimator. We apply
these confidence sets to the stochastic multiarmed bandit problem and the
stochastic linear bandit problem. We show that the regret of the modified UCB
algorithm is constant for any fixed confidence parameter! Furthermore, we
improve the previously known bounds for the stochastic linear bandit problem by
a polylogarithmic factor.
Bib info: Proceedings of NIPS 2011
Download:
[PDF],
[arXiv]

Gábor Bartók, Dávid Pál, Csaba Szepesvári
Minimax Regret of Finite PartialMonitoring Games in Stochastic Environments
Summary:
We continue with our ALT 2010 paper by classification of minimax regret of all
partial games with finite number of actions and outcomes. However, we restrict
ourselves to stochastic environments, i.e. the environment chooses outcomes
i.i.d. from a fixed but unknown probability distribution. We show that regret
of any game falls into one of the four categories: zero,
T^{1/2}, T^{2/3} or T. We provide
algorithms that achieve that regret within logarithmic factor. We believe that
this result can be lifted to adversarial setting also; however that's left as a
future work.
Bib info: Proceedings of COLT 2011
Download:
[PDF]

Gábor Bartók, Dávid Pál, Csaba Szepesvári
Toward a Classification of Partial Monitoring Games
Summary:
Partial monitoring games are online learning problems where a decision maker
repeatedly makes a decision, receives a feedback and suffers a loss. Multiarmed
bandits and dynamic pricing are special cases. We give characterization of
minmax regret of all finite partial monitoring games with two outcomes. We show
that regret of any such game falls into one of the four categories: zero,
T^{1/2}, T^{2/3} or T. Our characterization
is nice and simple combinatorial/geometrical condition on the loss matrix.
Bib info: Proceedings of ALT 2010, Gábor received for this paper the best student paper award.
Download:
[PDF],
[arXiv]

Dávid Pál, Barnabás Póczos, Csaba Szepesvári
Estimation of Rényi Entropy and Mutual Information Based on Generalized NearestNeighbor Graphs
Summary:
We consider simple and computationally efficient nonparametric estimators of
Rényi entropy and mutual information based on an i.i.d. sample drawn from an
unknown, absolutely continuous distribution over R^{d}. The
estimators are calculated as the sum of pth powers of the Euclidean
lengths of the edges of the `generalized nearestneighbor' graph of the sample
and the empirical copula of the sample respectively. Under mild conditions we
prove the almost sure consistency of the estimators. In addition, we derive
high probability error bounds assuming that the density underlying the sample is
Lipschitz continuous.
Bib info: Proceedings of NIPS 2010
Download:
conference version: [PDF],
extended version with proofs: [PDF], [arXiv],
poster: [PDF]

Shai BenDavid, Dávid Pál, Shai ShalevShwartz
Agnostic Online Learning
Summary:
We generalize the Littlestone's online learning model to the agnostic setting
where no hypothesis makes zero error. Littlestone defined a combinatorial
parameter of the hypothesis class, which we call Littlestone's dimension and
which determines the worstcase number of prediction mistakes made by an online
learning algorithm in the realizable setting. Point of our paper is that
Littlestone's dimension characterizes learnability in the agnostic case as well.
Namely, we give upper and lower bounds on the regret in terms of the
Littlestone's dimension.
This is a similar story to what happened to the Valiant's PAC model. The key
quantity there is VapnikChervonekis dimension. Valiant's PAC model is the
``realizable case''. Our work can be paralleled to what Haussler and others did
in 1992 when they generalized the PAC model to the agnostic setting and showed
that VapnikChervonekis dimension remains the key quantity characterizing
learnability there as well.
Bib info: Proceedings of COLT 2009
Download:
[PDF],
[slides],
[slides sources]

Ph.D. thesis
Contributions to Unsupervised and SemiSupervised Learning
Summary:
My thesis is essentially a compilation of the two older papers on clustering
stability and an part of the paper on semisupervised learning. The formal gaps,
especially in the first paper on clustering stability, were filled in and
presentation was improved.
Download:
[PDF],
[sources]

Tyler Lu, Dávid Pál, Martin Pál
Showing Relevant Ads via Lipschitz Context MultiArmed Bandits
Summary:
We study context multiarmed bandit problems where the context comes from
a metric space and the payoff satisfies a Lipschitz condition with respect to
the metric. Abstractly, a context multiarmed bandit problem models a situation
in which, in a sequence of independent trials, an online algorithm chooses an
action based on a given context (side information) from a set of possible
actions so as to maximize the total payoff of the chosen actions. The payoff
depends on both the action chosen and the context. In contrast, contextfree
multiarmed bandit problems, studied previously, model situations where no side
information is available and the payoff depends only on the action chosen.
For concreteness we focus in this paper on an application to Internet search
advertisement where the task is to display ads to a user of a Internet search
engine based on his search query so as to maximize the clicktrough rate of the
ads displayed. We cast this problem as a context multiarmed bandit problem
where queries and ads form metric spaces and the payoff function is Lipschitz
with respect to both the metrics. We present an algorithm, give upper bound on
its regret and show an (almost) matching lower bound on the regret of any
algorithm.
Bib info: Proceedings of AISTATS 2010
Download:
[PDF],
[slides],
[slides sources]

Shai BenDavid, Tyler Lu, Dávid Pál, Miroslava Sotáková
Learning LowDensity Separators
Summary:
We define a novel, basic, unsupervised learning problem  learning the lowest
density homogeneous hyperplane separator of an unknown probability distribution.
This task is relevant to several problems in machine learning, such as
semisupervised learning and clustering stability. We investigate the question
of existence of a universally consistent algorithm for this problem. We propose
two natural learning paradigms and prove that, on input unlabeled random samples
generated by any member of a rich family of distributions, they are guaranteed
to converge to the optimal separator for that distribution. We complement this
result by showing that no learning algorithm for our task can achieve uniform
learning rates (that are independent of the data generating distribution).
Bib info: Proceedings of AISTATS 2009
Download:
[PDF]
[arXiv]

Shai BenDavid, Tyler Lu, Dávid Pál
Does Unlabeled Data Provably Help? Worstcase Analysis of the Sample Complexity of SemiSupervised Learning
Summary:
We mathematically study the potential benefits of the use of unlabeled data to
decrease classification error. We propose a simple model of semisupervised
learning (SSL) in which the unlabeled data distribution is perfectly known to
the learner. We compare this model with the standard PAC model (and its agnostic
version) for supervised learning, in which the unlabeled distribution is uknown
to the learner. Does there exists a supervised algorithm which for any unlabeled
distribution needs in the worst case (over possible targets) at most by a
constant factor more labeled samples than any SSL learner? It seems that the
ERM algorithm is such a candidate. We prove, for some special cases, that this
indeed the case.
Bib info: Proceedings of COLT 2008
Note: See also Tyler's Master's thesis.
Download:
[PDF] [sources]

Gagan Aggarwal, S. Muthukrishnan, Dávid Pál, Martin Pál
General Auction Mechanism for Search Advertising
Summary:
We propose a new auction mechanism for search advertising that generalizes both
Generalized Second Price auction (which is currently, in 2008, used by Google,
Yahoo! and MSN) and the famous VickreyClarkeGroves mechanism adapted to the
search auctions. Our mechanism allows each bidder to specify a subset of slots
in which (s)he is interested, and the value and the maximum price of each slot.
For the auctioneer (the search engine) it allows to specify for every
slotbidder pair a reserve (minimum) price. Our auction computes the
bidderoptimal stable matching. The running time is O(nk^{3}),
where n is the number of bidders and k is the number of slots. We
also prove that our auction mechanism is truthrevealing, that is, dominant
strategy of each bidder is not to lie.
Bib info: Proceedings of WWW 2009
Download:
[PDF]

Shai BenDavid, Dávid Pál, Hans Ulrich Simon
Stability of Kmeans Clusterings
Summary:
This is followup of the previous paper. We consider the clustering algorithm
A which minimizes the kmeans cost precisely. (Such algorithm is not
realistic, since the optimization problem is known to be NPhard. Nevertheless,
the classical Lloyd's heuristic a.k.a. "the kmean algorithm" tries to do this.)
We analyze how stable is the clustering outputted by A with respect to a
random draw of the sample points. Given a probability distribution, draw two
i.i.d. samples of the same size. If with high probability, the clustering of
the first sample does not differ from the clustering of the second sample too
much, we say that A is stable on the probability distribution. For
discrete probability distributions we show that instability happens if and only
if the probability distribution has two or more clusterings with optimal kmeans
cost.
Bib info: Proceedings of COLT 2007
Download:
[PDF] (extended version),
[PDF] (short conference version)

Shai BenDavid, Dávid Pál, Ulrike von Luxburg
A Sober look at Clustering Stability
Summary:
We investigate how stable is the clustering outputted by a clustering algorithm
with respect to a random draw of the sample points. Given a probability
distribution, draw two i.i.d. samples of the same size. If with high
probability, the clustering of the first sample does not differ from the
clustering of the second sample too much, we say that algorithm is stable on the
probability distribution. We show that for that for algorithms that optimize
some cost function stability depends on the uniqueness of the optimal solution
of the optimization problem for the "infinite" sample represented by the
probability distribution.
Bib info: Proceedings of COLT 2006, Best Student Paper Award
Download:
[PDF],
[Slides in PDF],
[Slides source files]

Dávid Pál, Martin Škoviera
Colouring Cubic Graphs by Small Steiner Triple Systems
Summary:
Steiner triple system is a collection of 3element subsets (called
triples) of a finite set of points, such that every two points appear together
in exactly one triple. Given Steiner triple system S, one can ask whether it is
possible to color the edges of a cubic graph with points of S, in such way that
colors of the three edges at a vertex form a triple of S. We construct a small
Steiner triple system (of order 21) which colors all simple cubic graphs.
Note: This is the main result of my "Magister" Thesis at Comenius University under supervision of my advisor Martin Škoviera, 2004.
Bib info: Graphs and Combinatorics, 2007
Download:
[PDF],
[Postscript],
[Source files]

Coauthors
Yasin AbbasiYadkori
Gagan Aggarwal
Gábor Bartók
Shai BenDavid
Satyen Kale
Chansoo Lee
Tyler Lu
Ulrike von Luxburg
S. Muthukrisnan
Francesco Orabona
Martin Pál
Barnabás Póczos
Hans Ulrich Simon
Miroslava Sotáková
Martin Škoviera
Shai ShalevShwartz
Csaba Szepesvári
