Fault-tolerant fidelity based on few-qubit codes: Parity-check circuits for biased error channels
Dawei Jiao, Ying Li
Sep 22 2020 quant-ph arXiv:2009.09726v1
@misc{2009.09726,
  author = {Dawei Jiao and Ying Li},
  title = {{F}ault-tolerant fidelity based on few-qubit codes: {P}arity-check circuits for biased error channels},
  year = {2020},
  eprint = {2009.09726},
  note = {arXiv:2009.09726v1}
}
In the shallow sub-threshold regime, fault-tolerant quantum computation requires a tremendous number of qubits. In this paper, we study error correction in the deep sub-threshold regime. We estimate the physical error rate required to achieve logical error rates of $10^{-6} - 10^{-15}$ using few-qubit codes, i.e. short repetition codes, small surface codes and the Steane code. Error correction circuits that are efficient for biased error channels are identified. Using the Steane code, when error channels are biased with a ratio of $10^{-3}$, a logical error rate of $10^{-15}$ can be achieved with a physical error rate of $10^{-5}$, which is much higher than the physical error rate of $10^{-9}$ required for depolarising errors.
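To illustrate the mechanism in the simplest case, here is a classical Monte Carlo sketch (mine, not the paper's parity-check circuits; the function and its parameters are illustrative): the logical error rate of a three-bit repetition code under independent bit flips, decoded by majority vote.

```python
import random

def logical_error_rate(p, n=3, trials=200_000, seed=1):
    """Monte Carlo estimate of the logical error rate of an n-bit
    repetition code under independent bit-flip noise of rate p,
    decoded by majority vote."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        flips = sum(rng.random() < p for _ in range(n))
        if flips > n // 2:          # majority vote fails
            failures += 1
    return failures / trials

# For n = 3 the logical rate scales roughly as 3 * p**2, so it falls
# much faster than the physical rate in the deep sub-threshold regime.
print(logical_error_rate(0.01))
```

For $n = 3$ the logical rate is quadratically suppressed in $p$; the codes studied in the paper generalise this trade-off to larger codes and biased channels.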

Quantum erasing the memory of Wigner's friend
Cyril Elouard, Philippe Lewalle, Sreenath K. Manikandan, Spencer Rogers, Adam Frank, Andrew N. Jordan
Sep 22 2020 quant-ph arXiv:2009.09905v1
@misc{2009.09905,
  author = {Cyril Elouard and Philippe Lewalle and Sreenath K.~Manikandan and Spencer Rogers and Adam Frank and Andrew N.~Jordan},
  title = {{Q}uantum erasing the memory of {W}igner's friend},
  year = {2020},
  eprint = {2009.09905},
  note = {arXiv:2009.09905v1}
}
The Wigner's friend paradox concerns one of the most puzzling concepts of quantum mechanics: the consistent description of multiple nested observers. Recently, a variation of Wigner's gedankenexperiment, introduced by Frauchiger and Renner, has led to new debates about the self-consistency of quantum mechanics. We propose a simple single-photon interferometric setup implementing their scenario, and use our reformulation to shed new light on the assumptions leading to their paradox. From our description, we argue that the three apparently incompatible properties used to question the consistency of quantum mechanics correspond to two logically distinct contexts: either assuming that Wigner has full control over his friends' lab, or conversely that some part of the labs remains unaffected by Wigner's subsequent measurements. The first context may be seen as the quantum erasure of the memory of Wigner's friend. We further show that these properties are associated with observables which do not commute, and therefore cannot take well-defined values simultaneously. Consequently, the three contradictory properties never hold simultaneously.

Measurement of Gravitational Coupling between Millimeter-Sized Masses
Tobias Westphal, Hans Hepach, Jeremias Pfaff, Markus Aspelmeyer
Sep 22 2020 gr-qc quant-ph physics.class-ph physics.ins-det arXiv:2009.09546v1
@misc{2009.09546,
  author = {Tobias Westphal and Hans Hepach and Jeremias Pfaff and Markus Aspelmeyer},
  title = {{M}easurement of {G}ravitational {C}oupling between {M}illimeter-{S}ized {M}asses},
  year = {2020},
  eprint = {2009.09546},
  note = {arXiv:2009.09546v1}
}
We demonstrate gravitational coupling between two gold spheres of approximately 1 mm radius and 90 mg mass. By periodically modulating the source mass position at a frequency f = 12.7 mHz we generate a time-dependent gravitational acceleration at the location of the test mass, which is measured off resonance in a miniature torsional balance configuration. Over an integration time of 350 hours the test mass oscillator enables measurements with a systematic accuracy of 4E-11 m/s^2 and a statistical precision of 4E-12 m/s^2. This is sufficient to resolve the gravitational signal at a minimal surface distance of 400 μm between the two masses. We observe both linear and quadratic coupling, consistent in signal strength with a time-varying 1/r gravitational potential. Contributions of non-gravitational forces could be kept to less than 10% of the observed signal. We expect further improvements to enable the isolation of gravity as a coupling force for objects well below the Planck mass. This opens the way for precision tests of gravity in a new regime of isolated microscopic source masses.
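As a back-of-the-envelope consistency check (mine, not the authors' analysis), Newton's law at the quoted geometry gives the expected scale of the signal. The centre-to-centre distance assumes two 1 mm radii plus the 400 μm surface gap; all numbers are taken from the abstract.

```python
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
m_source = 90e-6       # 90 mg source mass, in kg
radius = 1e-3          # ~1 mm sphere radius, in m
gap = 400e-6           # minimal surface separation, in m
r = 2 * radius + gap   # centre-to-centre distance: 2.4 mm

# Newtonian acceleration of the test mass due to the source mass
a = G * m_source / r**2
print(f"{a:.2e} m/s^2")   # ~1e-9 m/s^2, well above the 4E-12 m/s^2 precision
```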

A no-go theorem for the persistent reality of Wigner's friend's perception
Philippe Allard Guérin, Veronika Baumann, Flavio Del Santo, Časlav Brukner
Sep 22 2020 quant-ph arXiv:2009.09499v1
@misc{2009.09499,
  author = {Philippe Allard Guérin and Veronika Baumann and Flavio Del Santo and Časlav Brukner},
  title = {{A} no-go theorem for the persistent reality of {W}igner's friend's perception},
  year = {2020},
  eprint = {2009.09499},
  note = {arXiv:2009.09499v1}
}
The notorious Wigner's friend thought experiment (and modifications thereof) has in recent years received renewed interest, especially due to new arguments that force us to question some of the fundamental assumptions of quantum theory. In this paper, we formulate a no-go theorem for the persistent reality of Wigner's friend's perception, which allows us to conclude that the perceptions that the friend has of her own measurement outcomes at different times cannot "share the same reality", if seemingly natural quantum mechanical assumptions are met. More formally, this means that, in a Wigner's friend scenario, there is no joint probability distribution for the friend's perceived measurement outcomes at two different times that depends linearly on the initial state of the measured system and whose marginals reproduce the predictions of unitary quantum theory. This theorem entails that one must either (1) propose a nonlinear modification of the Born rule for two-time predictions, (2) sometimes prohibit the use of present information to predict the future --thereby reducing the predictive power of quantum theory-- or (3) deny that unitary quantum mechanics makes valid single-time predictions for all observers. We briefly discuss which of the theorem's assumptions are more likely to be dropped within various popular interpretations of quantum mechanics.

Optimal Provable Robustness of Quantum Classification via Quantum Hypothesis Testing
Maurice Weber, Nana Liu, Bo Li, Ce Zhang, Zhikuan Zhao
Sep 22 2020 quant-ph cs.CR cs.LG stat.ML arXiv:2009.10064v1
@misc{2009.10064,
  author = {Maurice Weber and Nana Liu and Bo Li and Ce Zhang and Zhikuan Zhao},
  title = {{O}ptimal {P}rovable {R}obustness of {Q}uantum {C}lassification via {Q}uantum {H}ypothesis {T}esting},
  year = {2020},
  eprint = {2009.10064},
  note = {arXiv:2009.10064v1}
}
Quantum machine learning models have the potential to offer speedups and better predictive accuracy compared to their classical counterparts. However, these quantum algorithms, like their classical counterparts, have been shown to be vulnerable to input perturbations, in particular for classification problems. These can arise either from noisy implementations or, as a worst-case type of noise, from adversarial attacks. Such attacks can undermine both the reliability and the security of quantum classification algorithms. In order to develop defence mechanisms and to better understand the reliability of these algorithms, it is crucial to understand their robustness properties in the presence of both natural noise sources and adversarial manipulation. From the observation that, unlike in the classical setting, the measurements involved in quantum classification algorithms are naturally probabilistic, we uncover and formalize a fundamental link between binary quantum hypothesis testing (QHT) and provably robust quantum classification. Then, from the optimality of QHT, we prove a robustness condition, which is tight under modest assumptions and enables us to develop a protocol to certify robustness. Since this robustness condition is a guarantee against worst-case noise scenarios, our result naturally extends to scenarios in which the noise source is known. Thus we also provide a framework to study the reliability of quantum classification protocols under more general settings.

The Complexity of Constrained Min-Max Optimization
Constantinos Daskalakis, Stratis Skoulakis, Manolis Zampetakis
Sep 22 2020 cs.LG cs.CC math.OC arXiv:2009.09623v1
@misc{2009.09623,
  author = {Constantinos Daskalakis and Stratis Skoulakis and Manolis Zampetakis},
  title = {{T}he {C}omplexity of {C}onstrained {M}in-{M}ax {O}ptimization},
  year = {2020},
  eprint = {2009.09623},
  note = {arXiv:2009.09623v1}
}
Despite its important applications in Machine Learning, min-max optimization of nonconvex-nonconcave objectives remains elusive. Not only are there no known first-order methods converging even to approximate local min-max points, but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity of the problem, as well as of the limitations of first-order methods in constrained min-max optimization problems with nonconvex-nonconcave objectives and linear constraints. As a warm-up, we show that, even when the objective is a Lipschitz and smooth differentiable function, deciding whether a min-max point exists, in fact even deciding whether an approximate min-max point exists, is NP-hard. More importantly, we show that an approximate local min-max point of large enough approximation is guaranteed to exist, but finding one such point is PPAD-complete. The same is true of computing an approximate fixed point of Gradient Descent/Ascent. An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin model. We show that, given oracle access to some function $f : P \to [-1, 1]$ and its gradient $\nabla f$, where $P \subseteq [0, 1]^d$ is a known convex polytope, every algorithm that finds an $\varepsilon$-approximate local min-max point needs to make a number of queries that is exponential in at least one of $1/\varepsilon$, $L$, $G$, or $d$, where $L$ and $G$ are respectively the smoothness and Lipschitzness of $f$ and $d$ is the dimension. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using $O(L/\varepsilon)$ many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems.
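A tiny sketch (mine, not from the paper) of why plain first-order dynamics can fail on even the simplest saddle problem: simultaneous gradient descent-ascent on the bilinear objective $f(x, y) = xy$ spirals away from its unique min-max point at the origin.

```python
def gda(step=0.1, iters=100, x0=1.0, y0=1.0):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y:
    x descends along grad_x f = y, y ascends along grad_y f = x."""
    x, y = x0, y0
    for _ in range(iters):
        x, y = x - step * y, y + step * x   # simultaneous updates
    return x, y

x, y = gda()
# Each update multiplies the squared distance to the origin by
# exactly 1 + step**2, so the iterates diverge rather than
# converging to the min-max point (0, 0).
print(x * x + y * y)
```

This is the classical cycling/divergence phenomenon that motivates asking about the complexity of approximate local min-max points in the first place.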

Embedding theorems for solvable groups
Vitaly Roman'kov
Sep 22 2020 math.GR arXiv:2009.09958v1
@misc{2009.09958,
  author = {Vitaly Roman'kov},
  title = {{E}mbedding theorems for solvable groups},
  year = {2020},
  eprint = {2009.09958},
  note = {arXiv:2009.09958v1}
}
In this paper, we prove a series of results on group embeddings in groups with a small number of generators. We show that each finitely generated group $G$ lying in a variety ${\mathcal M}$ can be embedded in a $4$-generated group $H \in {\mathcal M}{\mathcal A}$ (where ${\mathcal A}$ denotes the variety of abelian groups). If $G$ is a finite group, then $H$ can also be chosen to be finite. It follows that any finitely generated (finite) solvable group $G$ of derived length $l$ can be embedded in a $4$-generated (finite) solvable group $H$ of derived length $l+1$. Thus, we answer a question of V. H. Mikaelian and A. Yu. Olshanskii. It is also shown that any countable group $G \in {\mathcal M}$ such that the abelianization $G_{ab}$ is a free abelian group is embeddable in a $2$-generated group $H \in {\mathcal M}{\mathcal A}$.

The Complexity Landscape of Distributed Locally Checkable Problems on Trees
Yi-Jun Chang
Sep 22 2020 cs.DS cs.DC arXiv:2009.09645v1
@misc{2009.09645,
  author = {Yi-Jun Chang},
  title = {{T}he {C}omplexity {L}andscape of {D}istributed {L}ocally {C}heckable {P}roblems on {T}rees},
  year = {2020},
  eprint = {2009.09645},
  note = {arXiv:2009.09645v1}
}
Recent research revealed the existence of gaps in the complexity landscape of locally checkable labeling (LCL) problems in the LOCAL model of distributed computing. For example, the deterministic round complexity of any LCL problem on bounded-degree graphs is either $O(\log^\ast n)$ or $\Omega(\log n)$ [Chang, Kopelowitz, and Pettie, FOCS 2016]. The complexity landscape of LCL problems is now quite well-understood, but a few questions remain open. For bounded-degree trees, there is an LCL problem with round complexity $\Theta(n^{1/k})$ for each positive integer $k$ [Chang and Pettie, FOCS 2017]. It is conjectured that no LCL problem has round complexity $o(n^{1/(k-1)})$ and $\omega(n^{1/k})$ on bounded-degree trees. As of now, only the case of $k = 2$ has been proved [Balliu et al., DISC 2018]. In this paper, we show that for LCL problems on bounded-degree trees, there is indeed a gap between $\Theta(n^{1/(k-1)})$ and $\Theta(n^{1/k})$ for each $k \geq 2$. Our proof is constructive in the sense that it offers a sequential algorithm that decides which side of the gap a given LCL problem belongs to. We also show that it is EXPTIME-hard to distinguish between $\Theta(1)$-round and $\Theta(n)$-round LCL problems on bounded-degree trees. This improves upon a previous PSPACE-hardness result [Balliu et al., PODC 2019].

On Distributed Differential Privacy and Counting Distinct Elements
Lijie Chen, Badih Ghazi, Ravi Kumar, Pasin Manurangsi
Sep 22 2020 cs.DS cs.LG cs.CR stat.ML arXiv:2009.09604v1
@misc{2009.09604,
  author = {Lijie Chen and Badih Ghazi and Ravi Kumar and Pasin Manurangsi},
  title = {{O}n {D}istributed {D}ifferential {P}rivacy and {C}ounting {D}istinct {E}lements},
  year = {2020},
  eprint = {2009.09604},
  note = {arXiv:2009.09604v1}
}
We study the setup where each of $n$ users holds an element from a discrete set, and the goal is to count the number of distinct elements across all users, under the constraint of $(\epsilon, \delta)$-differential privacy:
- In the non-interactive local setting, we prove that the additive error of any protocol is $\Omega(n)$ for any constant $\epsilon$ and for any $\delta$ inverse polynomial in $n$.
- In the single-message shuffle setting, we prove a lower bound of $\Omega(n)$ on the error for any constant $\epsilon$ and for some $\delta$ inverse quasi-polynomial in $n$. We do so by building on the moment-matching method from the literature on distribution estimation.
- In the multi-message shuffle setting, we give a protocol with at most one message per user in expectation and with an error of $\tilde{O}(\sqrt{n})$ for any constant $\epsilon$ and for any $\delta$ inverse polynomial in $n$. Our protocol is also robustly shuffle private, and our error of $\sqrt{n}$ matches a known lower bound for such protocols.
Our proof technique relies on a new notion that we call dominated protocols, which can also be used to obtain the first non-trivial lower bounds against multi-message shuffle protocols for the well-studied problems of selection and learning parity. Our first lower bound for estimating the number of distinct elements provides the first $\omega(\sqrt{n})$ separation between global sensitivity and error in local differential privacy, thus answering an open question of Vadhan (2017). We also provide a simple construction that gives $\tilde{\Omega}(n)$ separation between global sensitivity and error in two-party differential privacy, thereby answering an open question of McGregor et al. (2011).
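By way of contrast with these lower bounds (my sketch, not from the paper): in the central model the distinct count has global sensitivity 1 under replacing one user's element, so Laplace noise of scale $1/\epsilon$ gives error independent of $n$. The function name and the inverse-CDF Laplace sampler are illustrative.

```python
import math
import random

def dp_distinct_count(elements, eps, seed=0):
    """Central-model differentially private count of distinct elements.
    Replacing one user's element shifts the true count by at most 1,
    so the global sensitivity is 1 and Laplace(1/eps) noise suffices."""
    rng = random.Random(seed)
    true_count = len(set(elements))
    u = rng.random() - 0.5                 # uniform on [-0.5, 0.5)
    scale = 1.0 / eps
    # Laplace(scale) sample via the inverse CDF
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

print(dp_distinct_count(range(100), eps=1.0))   # close to 100
```

The $O(1/\epsilon)$ central-model error is exactly what the paper shows to be unattainable for local and single-message shuffle protocols.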

Towards quantum simulation of spin systems using continuous variable quantum devices
Razieh Annabestani, Brajesh Gupt, Bhaskar Roybardhan
Sep 22 2020 quant-ph arXiv:2009.09455v1
@misc{2009.09455,
  author = {Razieh Annabestani and Brajesh Gupt and Bhaskar Roybardhan},
  title = {{T}owards quantum simulation of spin systems using continuous variable quantum devices},
  year = {2020},
  eprint = {2009.09455},
  note = {arXiv:2009.09455v1}
}
We study the bosonic representation of the spin Ising model with the aim of simulating two-level systems on continuous variable quantum processors. We decompose the time evolution of spin systems into a sequence of continuous variable logical gates and analyze their structure. We provide an estimate of how the quantum circuit scales with the size of the spin lattice. The result makes a two-way connection between discrete variable and continuous variable models and paves the way towards building a universal quantum computer. Furthermore, we discuss the possibility of using a Gaussian Boson Sampling device to estimate the ground state energy of the Ising Hamiltonian. The result has application in developing hybrid classical-quantum algorithms such as a continuous variable version of the variational quantum eigensolver.
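As classical context (my sketch, not the paper's method), the ground-state energy of a small Ising chain can be found by exhaustive enumeration; the exponential cost in the number of spins is what motivates quantum and variational alternatives.

```python
from itertools import product

def ising_ground_energy(J, h):
    """Exact ground-state energy of an open Ising chain
    H = -sum_i J[i] s_i s_{i+1} - sum_i h[i] s_i, with s_i in {-1, +1},
    by brute-force enumeration (feasible only up to ~20 spins)."""
    n = len(h)
    best = float("inf")
    for spins in product((-1, 1), repeat=n):
        e = -sum(J[i] * spins[i] * spins[i + 1] for i in range(n - 1))
        e -= sum(h[i] * spins[i] for i in range(n))
        best = min(best, e)
    return best

# Ferromagnetic 3-spin chain: both aligned configurations reach -2.
print(ising_ground_energy(J=[1, 1], h=[0, 0, 0]))
```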

A Tutorial on Quantum Convolutional Neural Networks (QCNN)
Seunghyeok Oh, Jaeho Choi, Joongheon Kim
Sep 22 2020 quant-ph arXiv:2009.09423v1
@misc{2009.09423,
  author = {Seunghyeok Oh and Jaeho Choi and Joongheon Kim},
  title = {{A} {T}utorial on {Q}uantum {C}onvolutional {N}eural {N}etworks ({QCNN})},
  year = {2020},
  eprint = {2009.09423},
  note = {arXiv:2009.09423v1}
}
The Convolutional Neural Network (CNN) is a popular model in computer vision that makes good use of the correlation structure of data. However, CNNs are challenging to train efficiently when the dimension of the data or the model becomes too large. The Quantum Convolutional Neural Network (QCNN) offers both a new way to approach such problems in a quantum computing environment and a direction for improving the performance of existing learning models. The first study introduced here proposes a model that effectively solves classification problems in quantum physics and chemistry by applying the structure of a CNN to the quantum computing environment; the model can be evaluated with O(log(n)) circuit depth using the Multi-scale Entanglement Renormalization Ansatz (MERA). The second study introduces a method to improve performance by adding a quantum computing layer to the CNN models used in classical computer vision. This approach also works on small quantum computers: a hybrid learning model can be designed by adding a quantum convolution layer to a CNN model or by replacing a convolution layer with one. The paper also examines whether the QCNN model can learn efficiently compared to a CNN by training on the MNIST dataset with the TensorFlow Quantum platform.

Entanglement Hamiltonian Tomography in Quantum Simulation
Christian Kokail, Rick van Bijnen, Andreas Elben, Benoît Vermersch, Peter Zoller
Sep 22 2020 quant-ph arXiv:2009.09000v1
@misc{2009.09000,
  author = {Christian Kokail and Rick van Bijnen and Andreas Elben and Benoît Vermersch and Peter Zoller},
  title = {{E}ntanglement {H}amiltonian {T}omography in {Q}uantum {S}imulation},
  year = {2020},
  eprint = {2009.09000},
  note = {arXiv:2009.09000v1}
}
Entanglement is the crucial ingredient of quantum many-body physics, and characterizing and quantifying entanglement in the closed-system dynamics of quantum simulators is an outstanding challenge in today's era of intermediate-scale quantum devices. Here we discuss an efficient tomographic protocol for reconstructing reduced density matrices and entanglement spectra for spin systems. The key step is a parametrization of the reduced density matrix in terms of an entanglement Hamiltonian involving only quasi-local few-body terms. This ansatz is fitted to, and can be independently verified from, a small number of randomised measurements. The ansatz is suggested by Conformal Field Theory in quench dynamics, and via the Bisognano-Wichmann theorem for ground states. Not only does the protocol provide a testbed for these theories in quantum simulators, it is also applicable outside these regimes. We show the validity and efficiency of the protocol for a long-range Ising model in 1D using numerical simulations. Furthermore, by analyzing data from $10$- and $20$-ion quantum simulators [Brydges \textit{et al.}, Science, 2019], we demonstrate measurement of the evolution of the entanglement spectrum in quench dynamics.

Exploring Intensity Invariance in Deep Neural Networks for Brain Image Registration
Hassan Mahmood, Asim Iqbal, Syed Mohammed Shamsul Islam
Sep 22 2020 cs.CV cs.LG cs.AI arXiv:2009.10058v1
@misc{2009.10058,
  author = {Hassan Mahmood and Asim Iqbal and Syed Mohammed Shamsul Islam},
  title = {{E}xploring {I}ntensity {I}nvariance in {D}eep {N}eural {N}etworks for {B}rain {I}mage {R}egistration},
  year = {2020},
  eprint = {2009.10058},
  note = {arXiv:2009.10058v1}
}
Image registration is a widely used technique for analysing the large-scale datasets captured through various biomedical imaging modalities and techniques such as MRI and X-rays. These datasets are typically collected from various sites, under different imaging protocols, and with a variety of scanners. Such heterogeneity in the data collection process causes inhomogeneity or variation in intensity (brightness) and noise distribution, and these variations degrade the performance of image registration, segmentation, and detection algorithms. Classical image registration methods are computationally expensive but handle these artifacts relatively well; deep learning-based techniques, by contrast, are computationally efficient for automated brain registration but sensitive to intensity variations. In this study, we investigate the effect of variation in intensity distribution among input image pairs on deep learning-based image registration methods. We find that these models degrade when presented with brain image pairs of differing intensity distributions, even when the structures are similar. To overcome this limitation, we incorporate a structural similarity-based loss function into a deep neural network and test its performance both on a validation split held out before training and on a completely unseen new dataset. We report that deep learning models trained with the structural similarity-based loss seem to perform better on both datasets. This investigation highlights a possible performance-limiting factor in deep learning-based registration models and suggests a potential solution: accounting for the intensity distribution variation in the input image pairs. Our code and models are available at https://github.com/hassaanmahmood/DeepIntense.

Composed Variational Natural Language Generation for Few-shot Intents
Congying Xia, Caiming Xiong, Philip Yu, Richard Socher
Sep 22 2020 cs.CL arXiv:2009.10056v1
@misc{2009.10056,
  author = {Congying Xia and Caiming Xiong and Philip Yu and Richard Socher},
  title = {{C}omposed {V}ariational {N}atural {L}anguage {G}eneration for {F}ew-shot {I}ntents},
  year = {2020},
  eprint = {2009.10056},
  note = {arXiv:2009.10056v1}
}
In this paper, we focus on generating training examples for few-shot intents in the realistic imbalanced scenario. To build connections between existing many-shot intents and few-shot intents, we consider an intent as a combination of a domain and an action, and propose a composed variational natural language generator (CLANG), a transformer-based conditional variational autoencoder. CLANG utilizes two latent variables to represent the utterances corresponding to two different independent parts (domain and action) in the intent, and the latent variables are composed together to generate natural examples. Additionally, to improve generator learning, we adopt a contrastive regularization loss that contrasts in-class with out-of-class utterance generation given the intent. To evaluate the quality of the generated utterances, experiments are conducted on the generalized few-shot intent detection task. Empirical results show that our proposed model achieves state-of-the-art performance on two real-world intent detection datasets.

Regularizing Attention Networks for Anomaly Detection in Visual Question Answering
Doyup Lee, Yeongjae Cheon, Wook-Shin Han
Sep 22 2020 cs.CV cs.LG arXiv:2009.10054v1
@misc{2009.10054,
  author = {Doyup Lee and Yeongjae Cheon and Wook-Shin Han},
  title = {{R}egularizing {A}ttention {N}etworks for {A}nomaly {D}etection in {V}isual {Q}uestion {A}nswering},
  year = {2020},
  eprint = {2009.10054},
  note = {arXiv:2009.10054v1}
}
For the stability and reliability of real-world applications, the robustness of DNNs has been evaluated in unimodal tasks. However, few studies consider the abnormal situations that a visual question answering (VQA) model might encounter at test time after deployment in the real world. In this study, we evaluate the robustness of state-of-the-art VQA models to five different anomalies, covering worst-case scenarios, the most frequent scenarios, and the current limitations of VQA models. Unlike the results in unimodal tasks, the maximum confidence of answers in VQA models cannot detect anomalous inputs, and post-training of the outputs, such as outlier exposure, is ineffective for VQA models. Thus, we propose an attention-based method, which uses the confidence of reasoning between input images and questions and shows much more promising results than previous methods from unimodal tasks. In addition, we show that a maximum entropy regularization of attention networks can significantly improve the attention-based anomaly detection of VQA models. Thanks to their simplicity, the attention-based anomaly detection and the regularization are model-agnostic methods that can be used with the various cross-modal attentions in state-of-the-art VQA models. The results imply that cross-modal attention in VQA is important for improving not only VQA accuracy but also robustness to various anomalies.
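To make the regularizer concrete, here is a minimal sketch (mine, not the authors' code) of the quantity involved: the Shannon entropy of a normalized attention distribution. Maximum-entropy regularization adds the negative of this to the loss, discouraging overconfident, sharply peaked attention maps.

```python
import math

def attention_entropy(weights):
    """Shannon entropy of a normalized attention distribution.
    Uniform attention maximizes it at log(len(weights)); a
    one-hot (fully confident) distribution drives it to zero."""
    return -sum(w * math.log(w) for w in weights if w > 0)

uniform = [0.25] * 4
peaked = [0.97, 0.01, 0.01, 0.01]
print(attention_entropy(uniform))   # log 4, the maximum for 4 entries
print(attention_entropy(peaked))    # much smaller
```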

Latin BERT: A Contextual Language Model for Classical Philology
David Bamman, Patrick J. Burns
Sep 22 2020 cs.CL arXiv:2009.10053v1
@misc{2009.10053,
  author = {David Bamman and Patrick J.~Burns},
  title = {{L}atin {BERT}: {A} {C}ontextual {L}anguage {M}odel for {C}lassical {P}hilology},
  year = {2020},
  eprint = {2009.10053},
  note = {arXiv:2009.10053v1}
}
We present Latin BERT, a contextual language model for the Latin language, trained on 642.7 million words from a variety of sources spanning the Classical era to the 21st century. In a series of case studies, we illustrate the affordances of this language-specific model both for work in natural language processing for Latin and in using computational methods for traditional scholarship: we show that Latin BERT achieves a new state of the art for part-of-speech tagging on all three Universal Dependency datasets for Latin and can be used for predicting missing text (including critical emendations); we create a new dataset for assessing word sense disambiguation for Latin and demonstrate that Latin BERT outperforms static word embeddings; and we show that it can be used for semantically-informed search by querying contextual nearest neighbors. We publicly release trained models to help drive future work in this space.

Measuring justice in machine learning
Alan Lundgard
Sep 22 2020 cs.CY arXiv:2009.10050v1
@misc{2009.10050,
  author = {Alan Lundgard},
  title = {{M}easuring justice in machine learning},
  year = {2020},
  eprint = {2009.10050},
  doi = {10.1145/3351095.3372838},
  note = {arXiv:2009.10050v1}
}
How can we build more just machine learning systems? To answer this question, we need to know both what justice is and how to tell whether one system is more or less just than another. That is, we need both a definition and a measure of justice. Theories of distributive justice hold that justice can be measured (in part) in terms of the fair distribution of benefits and burdens across people in society. Recently, the field known as fair machine learning has turned to John Rawls's theory of distributive justice for inspiration and operationalization. However, philosophers known as capability theorists have long argued that Rawls's theory uses the wrong measure of justice, thereby encoding biases against people with disabilities. If these theorists are right, is it possible to operationalize Rawls's theory in machine learning systems without also encoding its biases? In this paper, I draw on examples from fair machine learning to suggest that the answer to this question is no: the capability theorists' arguments against Rawls's theory carry over into machine learning systems. But capability theorists don't only argue that Rawls's theory uses the wrong measure, they also offer an alternative measure. Which measure of justice is right? And has fair machine learning been using the wrong one?

Mapping Coalgebras I: Comonads
Brice Le Grignou
Sep 22 2020 math.CT arXiv:2009.10041v1
@misc{2009.10041,
  author = {Brice Le Grignou},
  title = {{M}apping {C}oalgebras {I}: {C}omonads},
  year = {2020},
  eprint = {2009.10041},
  note = {arXiv:2009.10041v1}
}
In this article we describe properties of the 2-functor from the 2-category of comonads to the 2-category of functors that sends a comonad to its forgetful functor. This allows us to describe contexts where algebras over a monad are enriched, tensored, and cotensored over coalgebras over a comonad.

The Data Driven Flavour Model
Luca Merlo
Sep 22 2020 hep-ph arXiv:2009.10040v1
@misc{2009.10040,
  author = {Luca Merlo},
  title = {{T}he {D}ata {D}riven {F}lavour {M}odel},
  year = {2020},
  eprint = {2009.10040},
  note = {arXiv:2009.10040v1}
}
A bottom-up approach has been adopted to identify a flavour model that agrees with present experimental measurements. The charged fermion mass hierarchies suggest that only the top Yukawa term should be present at the renormalisable level. The flavour symmetry of the Lagrangian including the fermionic kinetic terms and only the top Yukawa is then a combination of U(2) and U(3) factors. Lighter charged fermion and active neutrino masses and quark and lepton mixings arise from specific spurion fields. The associated phenomenology is investigated, and the model turns out to have almost the same flavour protection as Minimal Flavour Violation, in both the quark and lepton sectors. Promoting the spurions to dynamical fields, the associated scalar potential is also studied, and a minimum is identified such that fermion masses and mixings are correctly reproduced.

Optical geometries
Anna Fino, Thomas Leistner, Arman Taghavi-Chabert
Sep 22 2020 gr-qc hep-th math-ph math.DG math.MP arXiv:2009.10012v1
@misc{2009.10012,
  author = {Anna Fino and Thomas Leistner and Arman Taghavi-Chabert},
  title = {{O}ptical geometries},
  year = {2020},
  eprint = {2009.10012},
  note = {arXiv:2009.10012v1}
}
We study the notion of an optical geometry, defined to be a Lorentzian manifold equipped with a null line distribution, from the perspective of intrinsic torsion. This is an instance of a non-integrable version of holonomy reduction in Lorentzian geometry. Such distributions generate congruences of null curves, which play an important rôle in general relativity, and we investigate their conformal properties. We also extend this concept to the generalised optical geometries introduced by Robinson and Trautman.

The relative L^2 index theorem for Galois coverings
Moulay-Tahar Benameur
Sep 22 2020 math.OA arXiv:2009.10011v1
@misc{2009.10011,
  author = {Moulay-Tahar Benameur},
  title = {{T}he relative {L}^2 index theorem for {G}alois coverings},
  year = {2020},
  eprint = {2009.10011},
  note = {arXiv:2009.10011v1}
}
Given a Galois covering of complete spin manifolds where the base metric has positive scalar curvature (PSC) near infinity, we prove that for small enough epsilon > 0, the epsilon spectral projection of the Dirac operator has finite trace in the Atiyah von Neumann algebra. This allows us to define the L2 index in the even case, and we prove its compatibility with the Xie-Yu higher index. We also deduce L2 versions of the classical Gromov-Lawson relative index theorems. Finally, we briefly discuss some Gromov-Lawson L2 invariants.

Physical Zero-Knowledge Proof for Ripple Effect Suthee Ruangwises, Toshiya Itoh Sep 22 2020 cs.CR arXiv:2009.09983v1 Scited Scite! 1 @misc{2009.09983, author = {Suthee Ruangwises and Toshiya Itoh}, title = {{P}hysical {Z}ero-{K}nowledge {P}roof for {R}ipple {E}ffect}, year = {2020}, eprint = {2009.09983}, note = {arXiv:2009.09983v1} } Copy Citation PDF Ripple Effect is a logic puzzle whose objective is to fill numbers into a rectangular grid divided into rooms. Each room must contain consecutive integers from 1 up to its size. Also, if two cells in the same row or column have the same number $x$, the space separating the two cells must be at least $x$ cells. In this paper, we propose a physical zero-knowledge proof protocol for the Ripple Effect puzzle using a deck of cards, which allows a prover to physically show that he/she knows a solution without revealing it. In particular, we develop a physical protocol that, given a secret number $x$ and a list of numbers, verifies that $x$ does not appear among the first $x$ numbers in the list without revealing $x$ or any number in the list.
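The separation rule described above is simple to check mechanically. As a minimal illustration (this is plain verification code, not the paper's card-based zero-knowledge protocol, and the function name is hypothetical), a checker for one row or column:

```python
def row_satisfies_ripple(row):
    """Check the Ripple Effect separation rule for one row (or column):
    if two cells both hold the value x, at least x cells must lie
    strictly between them."""
    for i, x in enumerate(row):
        for j in range(i + 1, len(row)):
            if row[j] == x and (j - i - 1) < x:
                return False
    return True
```

For example, `[3, 1, 2, 1, 3]` satisfies the rule (the two 3s are separated by three cells, the two 1s by one cell), while `[2, 3, 2]` does not (the two 2s are separated by only one cell).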

Line Flow based SLAM Qiuyuan Wang, Zike Yan, Junqiu Wang, Fei Xue, Wei Ma, Hongbin Zha Sep 22 2020 cs.CV cs.RO arXiv:2009.09972v1 Scited Scite! 1 @misc{2009.09972, author = {Qiuyuan Wang and Zike Yan and Junqiu Wang and Fei Xue and Wei Ma and Hongbin Zha}, title = {{L}ine {F}low based {SLAM}}, year = {2020}, eprint = {2009.09972}, note = {arXiv:2009.09972v1} } Copy Citation PDF We propose a method of visual SLAM by predicting and updating line flows that represent sequential 2D projections of 3D line segments. While indirect SLAM methods using points and line segments have achieved excellent results, they still face problems in challenging scenarios such as occlusions, image blur, and repetitive textures. To deal with these problems, we leverage line flows, which encode the coherence of 2D and 3D line segments in the spatial and temporal domains, as the sequence of all the 2D line segments corresponding to a specific 3D line segment. Thanks to the line flow representation, the corresponding 2D line segment in a new frame can be predicted based on 2D and 3D line segment motions. We create, update, merge, and discard line flows on the fly. We model our Line Flow-based SLAM (LF-SLAM) using a Bayesian network. We perform short-term optimization in the front end and long-term optimization in the back end. The constraints introduced by line flows improve the performance of our LF-SLAM. Extensive experimental results demonstrate that our method achieves better performance than state-of-the-art direct and indirect SLAM approaches. Specifically, it obtains good localization and mapping results in challenging scenes with occlusions, image blur, and repetitive textures.

Domain-Embeddings Based DGA Detection with Incremental Training Method Xin Fang, Xiaoqing Sun, Jiahai Yang, Xinran Liu Sep 22 2020 cs.CR arXiv:2009.09959v1 Scited Scite! 1 @misc{2009.09959, author = {Xin Fang and Xiaoqing Sun and Jiahai Yang and Xinran Liu}, title = {{D}omain-{E}mbeddings {B}ased {DGA} {D}etection with {I}ncremental {T}raining {M}ethod}, year = {2020}, eprint = {2009.09959}, note = {arXiv:2009.09959v1} } Copy Citation PDF DGA-based botnets, which use Domain Generation Algorithms (DGAs) to evade supervision, have become one of the most destructive threats to network security. Over the past decades, a wealth of defense mechanisms focusing on domain features have emerged to address the problem. Nonetheless, DGA detection remains a daunting and challenging task due to the big-data nature of Internet traffic and the fact that linguistic features extracted only from domain names are insufficient and adversaries can easily forge them to disturb detection. In this paper, we propose a novel DGA detection system which employs an incremental word-embedding method to capture the interactions between end hosts and domains, characterize time-series patterns of DNS queries for each IP address, and thereby explore temporal similarities between domains. We carefully modify the Word2Vec algorithm and leverage it to automatically learn dynamic and discriminative feature representations for over 1.9 million domains, and develop a simple classifier for distinguishing malicious domains from benign ones. Given its ability to identify temporal patterns of domains and update models incrementally, the proposed scheme makes progress towards adapting to the changing and evolving strategies of DGA domains. Our system is evaluated and compared with the state-of-the-art system FANCI and two deep-learning methods, CNN and LSTM, on data from a large university's network named TUNET.
The results suggest that our system outperforms the strong competitors by a large margin on multiple metrics and meanwhile achieves a remarkable speed-up on model updating.
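The idea of capturing host-domain interactions can be sketched by grouping DNS queries per client IP into time-ordered "sentences" of domains, which a Word2Vec-style model would then consume so that domains queried together by the same hosts obtain similar embeddings. The following is a minimal sketch under that assumption; `build_domain_sentences` and the `(timestamp, ip, domain)` log format are hypothetical, not the paper's actual pipeline:

```python
from collections import defaultdict

def build_domain_sentences(dns_log):
    """Group DNS queries by client IP and order them by timestamp,
    yielding one 'sentence' of domains per host. These sequences are
    the natural input to a Word2Vec-style embedding model."""
    by_ip = defaultdict(list)
    for ts, ip, domain in dns_log:
        by_ip[ip].append((ts, domain))
    return {ip: [d for _, d in sorted(events)] for ip, events in by_ip.items()}
```

Each resulting list plays the role of a sentence: domains that co-occur in a host's query history end up in each other's context windows during training.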

NeuroDiff: Scalable Differential Verification of Neural Networks using Fine-Grained Approximation Brandon Paulsen, Jingbo Wang, Jiawei Wang, Chao Wang cs.LO cs.SE Sep 22 2020 cs.LG stat.ML arXiv:2009.09943v1 Scited Scite! 1 @misc{2009.09943, author = {Brandon Paulsen and Jingbo Wang and Jiawei Wang and Chao Wang}, title = {{N}euro{D}iff: {S}calable {D}ifferential {V}erification of {N}eural {N}etworks using {F}ine-{G}rained {A}pproximation}, year = {2020}, eprint = {2009.09943}, note = {arXiv:2009.09943v1} } Copy Citation PDF As neural networks make their way into safety-critical systems, where misbehavior can lead to catastrophes, there is a growing interest in certifying the equivalence of two structurally similar neural networks. For example, compression techniques are often used in practice for deploying trained neural networks on computationally- and energy-constrained devices, which raises the question of how faithfully the compressed network mimics the original network. Unfortunately, existing methods either focus on verifying a single network or rely on loose approximations to prove the equivalence of two networks. Due to overly conservative approximation, differential verification lacks scalability in terms of both accuracy and computational cost. To overcome these problems, we propose NeuroDiff, a symbolic and fine-grained approximation technique that drastically increases the accuracy of differential verification while achieving many orders-of-magnitude speedup. NeuroDiff has two key contributions. The first one is new convex approximations that more accurately bound the difference neurons of two networks under all possible inputs. The second one is judicious use of symbolic variables to represent neurons whose difference bounds have accumulated significant error. We also find that these two techniques are complementary, i.e., when combined, the benefit is greater than the sum of their individual benefits. 
We have evaluated NeuroDiff on a variety of differential verification tasks. Our results show that NeuroDiff is up to 1000X faster and 5X more accurate than the state-of-the-art tool.
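For intuition, the loose baseline that differential verification improves upon can be sketched with plain interval arithmetic: propagate an input box through each network separately, then subtract the resulting output intervals. This naive sketch (function names are hypothetical; NeuroDiff's actual convex approximations are much tighter) shows why the baseline over-approximates — even two identical networks get a non-trivial difference bound:

```python
import numpy as np

def affine_relu_interval(W, b, lo, hi):
    """Propagate an input box [lo, hi] through y = relu(W x + b)
    using standard interval arithmetic."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    out_lo = Wp @ lo + Wn @ hi + b
    out_hi = Wp @ hi + Wn @ lo + b
    return np.maximum(out_lo, 0), np.maximum(out_hi, 0)

def naive_difference_bounds(W1, b1, W2, b2, lo, hi):
    """Bound f1(x) - f2(x) by analysing each network separately and
    subtracting the intervals -- the loose baseline that differential
    verification tightens by bounding difference neurons directly."""
    l1, u1 = affine_relu_interval(W1, b1, lo, hi)
    l2, u2 = affine_relu_interval(W2, b2, lo, hi)
    return l1 - u2, u1 - l2
```

With two identical one-neuron networks (`W = [[1]]`, `b = [0]`) on the input box [-1, 1], this yields the difference bound [-1, 1], although the true difference is identically zero — exactly the conservatism that motivates fine-grained approximations.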

CMAX++ : Leveraging Experience for Planning and Execution using Inaccurate Models Anirudh Vemula, J. Andrew Bagnell, Maxim Likhachev cs.AI Sep 22 2020 cs.RO cs.LG arXiv:2009.09942v1 Scited Scite! 1 @misc{2009.09942, author = {Anirudh Vemula and J.~Andrew Bagnell and Maxim Likhachev}, title = {{CMAX}++ : {L}everaging {E}xperience for {P}lanning and {E}xecution using {I}naccurate {M}odels}, year = {2020}, eprint = {2009.09942}, note = {arXiv:2009.09942v1} } Copy Citation PDF Given access to accurate dynamical models, modern planning approaches are effective in computing feasible and optimal plans for repetitive robotic tasks. However, it is difficult to model the true dynamics of the real world before execution, especially for tasks requiring interactions with objects whose parameters are unknown. A recent planning approach, CMAX, tackles this problem by adapting the planner online during execution to bias the resulting plans away from inaccurately modeled regions. CMAX, while being provably guaranteed to reach the goal, requires strong assumptions on the accuracy of the model used for planning and fails to improve the quality of the solution over repetitions of the same task. In this paper we propose CMAX++, an approach that leverages real-world experience to improve the quality of resulting plans over successive repetitions of a robotic task. CMAX++ achieves this by integrating model-free learning using acquired experience with model-based planning using the potentially inaccurate model. We provide provable guarantees on the completeness and asymptotic convergence of CMAX++ to the optimal path cost as the number of repetitions increases. CMAX++ is also shown to outperform baselines in simulated robotic tasks including 3D mobile robot navigation where the track friction is incorrectly modeled, and a 7D pick-and-place task where the mass of the object is unknown leading to discrepancy between true and modeled dynamics.

Feature Distillation With Guided Adversarial Contrastive Learning Tao Bai, Jinnan Chen, Jun Zhao, Bihan Wen, Xudong Jiang, Alex Kot Sep 22 2020 cs.LG stat.ML arXiv:2009.09922v1 Scited Scite! 1 @misc{2009.09922, author = {Tao Bai and Jinnan Chen and Jun Zhao and Bihan Wen and Xudong Jiang and Alex Kot}, title = {{F}eature {D}istillation {W}ith {G}uided {A}dversarial {C}ontrastive {L}earning}, year = {2020}, eprint = {2009.09922}, note = {arXiv:2009.09922v1} } Copy Citation PDF Deep learning models are known to be vulnerable to adversarial examples. Though adversarial training can enhance model robustness, typical approaches are computationally expensive. Recent works proposed to transfer robustness against adversarial attacks across different tasks or models with soft labels. Compared to soft labels, features contain rich semantic information and hold the potential to be applied to different downstream tasks. In this paper, we propose a novel approach called Guided Adversarial Contrastive Distillation (GACD) to effectively transfer adversarial robustness from teacher to student with features. We first formulate this objective as contrastive learning and connect it with mutual information. With a well-trained teacher model as an anchor, students are expected to extract features similar to the teacher's. Then, considering the potential errors made by the teacher, we propose sample reweighted estimation to eliminate the teacher's negative effects. With GACD, the student not only learns to extract robust features, but also captures structural knowledge from the teacher. Through extensive experiments on popular datasets such as CIFAR-10, CIFAR-100 and STL-10, we demonstrate that our approach can effectively transfer robustness across different models and even different tasks, and achieve comparable or better results than existing methods.
Besides, we provide a detailed analysis of various methods, showing that students produced by our approach capture more structural knowledge from teachers and learn more robust features under adversarial attacks.
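The contrastive formulation mentioned above is commonly instantiated as an InfoNCE-style loss. A minimal NumPy sketch (an illustration of generic contrastive distillation, not GACD's exact sample-reweighted objective) that pulls a student feature toward the teacher anchor and away from negatives:

```python
import numpy as np

def info_nce(student_feat, teacher_anchor, negatives, tau=0.1):
    """InfoNCE-style contrastive loss: maximise agreement between the
    student feature and the teacher's feature for the same input, while
    pushing the student away from features of other samples.
    All vectors are assumed L2-normalised; tau is the temperature."""
    pos = np.dot(student_feat, teacher_anchor) / tau
    negs = negatives @ student_feat / tau
    logits = np.concatenate(([pos], negs))
    # negative log of the softmax probability assigned to the positive pair
    return -pos + np.log(np.sum(np.exp(logits)))
```

When the student matches the teacher anchor and is orthogonal to the negatives, the loss is near zero; when it instead aligns with a negative, the loss is large — the gradient therefore drags student features toward the teacher's representation.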

A Deep Learning Based Analysis-Synthesis Framework For Unison Singing Pritish Chandna, Helena Cuesta, Emilia Gómez Sep 22 2020 eess.AS cs.LG arXiv:2009.09875v1 Scited Scite! 1 @misc{2009.09875, author = {Pritish Chandna and Helena Cuesta and Emilia Gómez}, title = {{A} {D}eep {L}earning {B}ased {A}nalysis-{S}ynthesis {F}ramework {F}or {U}nison {S}inging}, year = {2020}, eprint = {2009.09875}, note = {arXiv:2009.09875v1} } Copy Citation PDF Unison singing is the name given to an ensemble of singers simultaneously singing the same melody and lyrics. While each individual singer in a unison sings the same principal melody, there are slight timing and pitch deviations between the singers, which, along with the ensemble of timbres, give the listener a perceived sense of "unison". In this paper, we present a study of unison singing in the context of choirs; utilising some recently proposed deep-learning based methodologies, we analyse the fundamental frequency (F0) distribution of the individual singers in recordings of unison mixtures. Based on the analysis, we propose a system for synthesising a unison signal from an a cappella input and a single voice prototype representative of a unison mixture. We use subjective listening tests to evaluate perceptual factors of our proposed system for synthesis, including quality and adherence to the melody, as well as the degree of perceived unison.
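For intuition about the timing and pitch deviations discussed above, here is a toy synthesis (pure sinusoids, not the paper's deep-learning synthesiser; all parameter values are illustrative) that sums several copies of the same nominal F0, each with a small random pitch offset in cents and a small onset delay:

```python
import numpy as np

def synthesize_unison(f0=220.0, n_voices=4, detune_cents=10.0,
                      max_delay=0.02, dur=1.0, sr=16000, seed=0):
    """Crude unison illustration: sum several sine 'voices' at the same
    nominal F0, each with a small random pitch offset (in cents) and a
    small onset delay -- the kind of per-singer deviation the paper
    measures and models. Output is normalised to [-1, 1]."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * sr)) / sr
    mix = np.zeros_like(t)
    for _ in range(n_voices):
        cents = rng.uniform(-detune_cents, detune_cents)
        f = f0 * 2 ** (cents / 1200.0)          # cents -> frequency ratio
        delay = rng.uniform(0.0, max_delay)
        mix += np.sin(2 * np.pi * f * np.clip(t - delay, 0.0, None))
    return mix / n_voices
```

The slight detuning produces the slow beating and spectral broadening that listeners perceive as "unison" rather than a single louder voice.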

DeepTag: Robust Image Tagging for DeepFake Provenance Run Wang, Felix Juefei-Xu, Qing Guo, Yihao Huang, Lei Ma, Yang Liu, Lina Wang Sep 22 2020 cs.CR arXiv:2009.09869v1 Scited Scite! 1 @misc{2009.09869, author = {Run Wang and Felix Juefei-Xu and Qing Guo and Yihao Huang and Lei Ma and Yang Liu and Lina Wang}, title = {{D}eep{T}ag: {R}obust {I}mage {T}agging for {D}eep{F}ake {P}rovenance}, year = {2020}, eprint = {2009.09869}, note = {arXiv:2009.09869v1} } Copy Citation PDF In recent years, DeepFake has become a common threat to our society, due to the remarkable progress of generative adversarial networks (GANs) in image synthesis. Unfortunately, existing studies that propose various approaches to determine whether a facial image is real or fake are still at an early stage. Obviously, current DeepFake detection methods struggle to catch up with the rapid progress of GANs, especially in adversarial scenarios where attackers can intentionally evade detection, such as by adding perturbations to fool DNN-based detectors. While passive detection simply tells whether an image is fake or real, DeepFake provenance, on the other hand, provides clues for tracking the sources in DeepFake forensics. Thus, tracked fake images could be blocked immediately by administrators, preventing further spread on social networks. In this paper, we investigate the potential of image tagging for DeepFake provenance. Specifically, we devise a deep learning-based approach, named DeepTag, with a simple yet effective encoder and decoder design that embeds a message into a facial image and recovers the embedded message with high confidence after various drastic GAN-based DeepFake transformations. The embedded message can represent the identity of a facial image, which further contributes to DeepFake detection and provenance.
Experimental results demonstrate that our proposed approach can recover the embedded message with an average accuracy of nearly 90%. Our findings confirm that image tagging is an effective privacy-preserving technique for protecting personal photos from being DeepFaked. Thus, effective proactive defense mechanisms should be developed to fight DeepFakes, instead of simply devising detection methods that can be largely ineffective in practice.

Three puzzles in cosmology Samir D. Mathur Sep 22 2020 hep-th gr-qc arXiv:2009.09832v1 Scited Scite! 1 @misc{2009.09832, author = {Samir D.~Mathur}, title = {{T}hree puzzles in cosmology}, year = {2020}, eprint = {2009.09832}, note = {arXiv:2009.09832v1} } Copy Citation PDF Cosmology presents us with several puzzles that are related to the fundamental structure of quantum theory. We discuss three such puzzles, linking them to effects that arise in black hole physics. We speculate that puzzles in cosmology may be resolved by the vecro structure of the vacuum that resolves the information paradox and the `bags of gold' problem for black holes.

Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems Ziming Li, Julia Kiseleva, Maarten de Rijke Sep 22 2020 cs.CL cs.LG arXiv:2009.09781v1 Scited Scite! 1 @misc{2009.09781, author = {Ziming Li and Julia Kiseleva and Maarten de Rijke}, title = {{R}ethinking {S}upervised {L}earning and {R}einforcement {L}earning in {T}ask-{O}riented {D}ialogue {S}ystems}, year = {2020}, eprint = {2009.09781}, howpublished = {Findings of EMNLP 2020}, note = {arXiv:2009.09781v1} } Copy Citation PDF Dialogue policy learning for task-oriented dialogue systems has enjoyed great progress recently, mostly through employing reinforcement learning methods. However, these approaches have become very sophisticated, and it is time to re-evaluate them. Are we really making progress developing dialogue agents only based on reinforcement learning? We demonstrate how (1) traditional supervised learning together with (2) a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods. First, we introduce a simple dialogue action decoder to predict the appropriate actions. Then, the traditional multi-label classification solution for dialogue policy learning is extended by adding dense layers to improve the dialogue agent's performance. Finally, we employ the Gumbel-Softmax estimator to alternately train the dialogue agent and the dialogue reward model without using reinforcement learning. Based on our extensive experimentation, we conclude that the proposed methods can achieve more stable and higher performance with less effort, such as the domain knowledge required to design a user simulator and the intractable parameter tuning in reinforcement learning. Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the roles of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
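The Gumbel-Softmax estimator used in the final step draws a differentiable approximation to a categorical sample: perturb the logits with Gumbel noise, then apply a temperature softmax. As the temperature approaches zero, the output approaches a one-hot sample. A minimal framework-free NumPy sketch (in practice this would run inside the training graph so gradients can flow through it):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample a relaxed categorical vector: add Gumbel(0, 1) noise to
    the logits and apply a softmax at temperature tau. Lower tau gives
    samples closer to one-hot."""
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise via the inverse-CDF trick
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max()                     # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()
```

The returned vector is a valid probability distribution over the categories, which is what lets the reward model receive (approximately) discrete dialogue actions while remaining end-to-end differentiable.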

Identifying Causal Effects via Context-specific Independence Relations Santtu Tikka, Antti Hyttinen, Juha Karvanen Sep 22 2020 cs.AI cs.LG arXiv:2009.09768v1 Scited Scite! 1 @misc{2009.09768, author = {Santtu Tikka and Antti Hyttinen and Juha Karvanen}, title = {{I}dentifying {C}ausal {E}ffects via {C}ontext-specific {I}ndependence {R}elations}, year = {2020}, eprint = {2009.09768}, note = {arXiv:2009.09768v1} } Copy Citation PDF Causal effect identification considers whether an interventional probability distribution can be uniquely determined from a passively observed distribution in a given causal structure. If the generating system induces context-specific independence (CSI) relations, the existing identification procedures and criteria based on do-calculus are inherently incomplete. We show that deciding causal effect non-identifiability is NP-hard in the presence of CSIs. Motivated by this, we design a calculus and an automated search procedure for identifying causal effects in the presence of CSIs. The approach is provably sound and it includes standard do-calculus as a special case. With the approach, we can obtain identifying formulas that were previously unobtainable, and demonstrate that a small number of CSI relations may be sufficient to turn a previously non-identifiable instance into an identifiable one.

Ranky : An Approach to Solve Distributed SVD on Large Sparse Matrices Resul Tugay, Sule Gunduz Oguducu Sep 22 2020 cs.LG stat.ML arXiv:2009.09767v1 Scited Scite! 1 @misc{2009.09767, author = {Resul Tugay and Sule Gunduz Oguducu}, title = {{R}anky : {A}n {A}pproach to {S}olve {D}istributed {SVD} on {L}arge {S}parse {M}atrices}, year = {2020}, eprint = {2009.09767}, note = {arXiv:2009.09767v1} } Copy Citation PDF Singular Value Decomposition (SVD) is a well-studied research topic in many fields and applications, from data mining to image processing. Data arising from these applications can be represented as a matrix that is large and sparse. Most existing algorithms calculate the singular values and left and right singular vectors of a large dense matrix, but not of a large sparse matrix. Even when they can find the SVD of a large matrix, the calculation has high time complexity due to sequential algorithms. Distributed approaches have been proposed for computing the SVD of large matrices. However, the rank of the matrix remains a problem when solving the SVD with these distributed algorithms. In this paper we propose Ranky, a set of methods to solve the rank problem on large and sparse matrices in a distributed manner. Experimental results show that the Ranky approach recovers the singular values and left and right singular vectors of a given large and sparse matrix with negligible error.
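As background for the sparse-SVD setting, the leading singular triplet of a matrix can be recovered with power iteration on $A^T A$ — a sequential building block that distributed SVD schemes shard across workers. This is a generic sketch with a hypothetical function name, not the Ranky method itself:

```python
import numpy as np

def top_singular_triplet(A, iters=200, seed=0):
    """Recover the leading singular value and vectors of A by power
    iteration on A^T A. Each step only needs matrix-vector products,
    so A may be stored in any sparse format."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A.T @ (A @ v)               # one step of power iteration
        v /= np.linalg.norm(v)
    sigma = np.linalg.norm(A @ v)       # leading singular value
    u = (A @ v) / sigma                 # corresponding left singular vector
    return u, sigma, v
```

Because the iteration touches A only through products `A @ v` and `A.T @ w`, it extends naturally to matrices partitioned row-wise across machines, which is the regime the paper targets.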