The 1,476 reviews for the ICLR 2019 conference were released early last November. The reviewers' comments are also available on the OpenReview website for ICLR 2019. However, there was no feature for sorting by review score, and there were so many papers that I didn't know what to read first, so I made a list sorted by review score.

There are two review scores: Rating and Confidence. The Rating is the score the reviewer gives to each paper; it represents the quality of the paper and whether the reviewer recommends accepting or rejecting it. The Confidence indicates how well the reviewer understands the area being reviewed: a reviewer who is confident in his or her opinion gives a high score. In most cases, higher is better for both. However, as a rough first pass I simply averaged the Ratings and sorted the papers in decreasing order of that average.

### The meaning of Rating

- 1: Trivial or wrong
- 2: Strong rejection
- 3: Clear rejection
- 4: Ok but not good enough – rejection
- 5: Marginally below acceptance threshold
- 6: Marginally above acceptance threshold
- 7: Good paper, accept
- 8: Top 50% of accepted papers, clear accept
- 9: Top 15% of accepted papers, strong accept
- 10: Top 5% of accepted papers, seminal paper

### The meaning of Confidence

- 1: The reviewer’s evaluation is an educated guess
- 2: The reviewer is willing to defend the evaluation, but it is quite likely that the reviewer did not understand central parts of the paper
- 3: The reviewer is fairly confident that the evaluation is correct
- 4: The reviewer is confident but not absolutely certain that the evaluation is correct
- 5: The reviewer is absolutely certain that the evaluation is correct and very familiar with the relevant literature

### Note

Each paper has 2 to 5 reviewers, but usually 3 to 4.

I sorted the papers in descending order of their average Rating. In the list below, each entry gives, in order: the rank (its position in the list), the average of all Ratings, the standard deviation of the Ratings, all Rating scores, all Confidence scores, and the title.
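The calculation described above can be sketched in a few lines of Python. This is a minimal sketch rather than the script actually used for this post: the `papers` sample, `summarize`, and `format_row` are hypothetical names introduced for illustration, and the three sample papers are copied from entries in the list. The standard deviations in the list match the population standard deviation (dividing by the number of reviewers), so the sketch uses `statistics.pstdev`.

```python
from statistics import mean, pstdev

# Hypothetical sample: (title, ratings, confidences) copied from three list entries.
papers = [
    ("Slimmable Neural Networks", [7, 9, 8], [4, 5, 4]),
    ("Exploration by random network distillation", [7, 10, 9, 4], [4, 4, 5, 4]),
    ("Large Scale GAN Training for High Fidelity Natural Image Synthesis",
     [10, 7, 8], [4, 3, 4]),
]


def summarize(papers):
    """Return (average, std, ratings, confidences, title) rows, best average first."""
    rows = [
        (mean(ratings), pstdev(ratings), ratings, confidences, title)
        for title, ratings, confidences in papers
    ]
    # Sort by the average Rating in descending order; Confidence is not used.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def format_row(row):
    """Format one row in a layout similar to the list entries."""
    avg, std, ratings, confidences, title = row
    return f"{avg:.2f}, {std:.2f}, {ratings} {confidences} {title}"


for row in summarize(papers):
    print(format_row(row))
```

Note that the Confidence scores are only carried along for display here; weighting the average by Confidence would be a different ranking from the one used in this post.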

In addition, after I made the list below and

However, some scores in the list below are more accurate, because the list reflects more recent updates than the site. So I am going to keep using the list I made.

If this is helpful, look through the list below and let me know if you find any interesting papers.
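If you would rather filter or re-sort the list yourself, each entry can be parsed back into numbers. The following is a minimal sketch, assuming every entry follows the pattern `- average, std, [ratings] [confidences]title`; the `ENTRY` pattern and the `parse_entry` helper are hypothetical names, not part of any tool used for this post.

```python
import re

# Assumed entry layout: "- 7.50, 2.29, [7, 10, 9, 4] [4, 4, 5, 4]Title"
ENTRY = re.compile(
    r"^-\s*(?P<avg>\d+\.\d+),\s*(?P<std>\d+\.\d+),\s*"
    r"\[(?P<ratings>[\d,\s]+)\]\s*\[(?P<confidences>[\d,\s]+)\]\s*(?P<title>.+)$"
)


def _to_ints(text):
    """Turn a string such as '7, 10, 9, 4' into a list of integers."""
    return [int(part) for part in text.split(",")]


def parse_entry(line):
    """Parse one list entry into a dict, or return None if the line does not match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    return {
        "avg": float(match["avg"]),
        "std": float(match["std"]),
        "ratings": _to_ints(match["ratings"]),
        "confidences": _to_ints(match["confidences"]),
        "title": match["title"].strip(),
    }


lines = [
    "- 8.00, 0.82, [7, 9, 8] [4, 5, 4]Slimmable Neural Networks",
    "- 7.50, 2.29, [7, 10, 9, 4] [4, 4, 5, 4]Exploration by random network distillation",
]
entries = [entry for entry in map(parse_entry, lines) if entry is not None]

# Example: keep only papers whose reviewers disagreed strongly (std of at least 2.0).
controversial = [entry["title"] for entry in entries if entry["std"] >= 2.0]
print(controversial)
```

The threshold of 2.0 is an arbitrary choice for the example; any other field in the parsed dict can be used as a filter or sort key in the same way.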

- 8.67, 1.25, [9, 10, 7] [3, 5, 3]GENERATING HIGH FIDELITY IMAGES WITH SUBSCALE PIXEL NETWORKS AND MULTIDIMENSIONAL UPSCALING
- 8.33, 0.94, [9, 9, 7] [4, 5, 3]Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
- 8.33, 1.25, [10, 7, 8] [4, 3, 4]Large Scale GAN Training for High Fidelity Natural Image Synthesis
- 8.33, 1.70, [9, 6, 10] [5, 4, 5]ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA
- 8.00, 0.82, [7, 9, 8] [5, 5, 3]Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
- 8.00, 0.82, [8, 7, 9] [4, 3, 4]Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
- 8.00, 0.82, [7, 9, 8] [4, 5, 4]Slimmable Neural Networks
- 8.00, 0.00, [8, 8, 8] [4, 2, 5]Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset
- 8.00, 0.82, [7, 9, 8] [5, 4, 4]Temporal Difference Variational Auto-Encoder
- 8.00, 1.63, [8, 10, 6] [3, 4, 3]Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow
- 8.00, 0.00, [8, 8, 8] [3, 4, 3]Unsupervised Learning of the Set of Local Maxima
- 8.00, 0.82, [7, 8, 9] [4, 4, 5]An Empirical Study of Example Forgetting during Deep Neural Network Learning
- 8.00, 0.82, [7, 9, 8] [4, 4, 5]Posterior Attention Models for Sequence to Sequence Learning
- 7.67, 0.47, [7, 8, 8] [3, 4, 3]Smoothing the Geometry of Probabilistic Box Embeddings
- 7.67, 1.25, [8, 6, 9] [4, 4, 4]ON RANDOM DEEP AUTOENCODERS: EXACT ASYMPTOTIC ANALYSIS, PHASE TRANSITIONS, AND IMPLICATIONS TO TRAINING
- 7.67, 0.94, [9, 7, 7] [4, 2, 3]Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware
- 7.67, 1.70, [6, 10, 7] [4, 3, 3]Identifying and Controlling Important Neurons in Neural Machine Translation
- 7.67, 1.25, [6, 8, 9] [5, 4, 4]Critical Learning Periods in Deep Networks
- 7.67, 1.25, [8, 9, 6] [4, 4, 4]Sparse Dictionary Learning by Dynamical Neural Networks
- 7.67, 1.70, [7, 10, 6] [4, 4, 4]KnockoffGAN: Generating Knockoffs for Feature Selection using Generative Adversarial Networks
- 7.67, 0.47, [8, 7, 8] [3, 3, 4]Learning Unsupervised Learning Rules
- 7.67, 0.94, [9, 7, 7] [3, 4, 4]Learning Robust Representations by Projecting Superficial Statistics Out
- 7.67, 0.94, [9, 7, 7] [5, 5, 4]A2BCD: Asynchronous Acceleration with Optimal Complexity
- 7.67, 0.47, [8, 7, 8] [4, 4, 4]Pay Less Attention with Lightweight and Dynamic Convolutions
- 7.67, 1.25, [8, 9, 6] [4, 4, 4]Supervised Community Detection with Line Graph Neural Networks
- 7.67, 0.47, [8, 7, 8] [2, 4, 3]Robustness May Be at Odds with Accuracy
- 7.67, 0.47, [7, 8, 8] [4, 4, 3]Kernel Change-point Detection with Auxiliary Deep Generative Models
- 7.67, 0.47, [8, 8, 7] [4, 4, 4]Adaptive Input Representations for Neural Language Modeling
- 7.67, 0.47, [7, 8, 8] [4, 3, 3]A Variational Inequality Perspective on Generative Adversarial Networks
- 7.67, 0.94, [7, 9, 7] [4, 4, 4]Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction
- 7.67, 0.47, [7, 8, 8] [4, 3, 4]Towards Robust, Locally Linear Deep Networks
- 7.50, 0.50, [7, 8, 8, 7] [4, 3, 3, 4]On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data
- 7.50, 2.29, [7, 10, 9, 4] [4, 4, 5, 4]Exploration by random network distillation
- 7.33, 1.70, [9, 8, 5] [4, 3, 4]Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer
- 7.33, 1.25, [7, 6, 9] [3, 4, 3]Learning Localized Generative Models for 3D Point Clouds via Graph Convolution
- 7.33, 0.47, [7, 7, 8] [4, 2, 3]Dynamic Sparse Graph for Efficient Deep Learning
- 7.33, 0.47, [8, 7, 7] [5, 4, 3]Differentiable Learning-to-Normalize via Switchable Normalization
- 7.33, 0.47, [7, 8, 7] [4, 4, 3]Learning to Remember More with Less Memorization
- 7.33, 1.25, [7, 9, 6] [3, 5, 4]Large-Scale Study of Curiosity-Driven Learning
- 7.33, 0.47, [7, 8, 7] [1, 5, 5]Evaluating Robustness of Neural Networks with Mixed Integer Programming
- 7.33, 0.47, [8, 7, 7] [4, 3, 3]Small nonlinearities in activation functions create bad local minima in neural networks
- 7.33, 0.47, [7, 7, 8] [3, 3, 2]Approximability of Discriminators Implies Diversity in GANs
- 7.33, 0.47, [7, 7, 8] [4, 3, 4]Diversity is All You Need: Learning Skills without a Reward Function
- 7.33, 0.47, [8, 7, 7] [4, 4, 5]Deep Frank-Wolfe For Neural Network Optimization
- 7.33, 1.25, [9, 7, 6] [3, 3, 3]ProMP: Proximal Meta-Policy Search
- 7.33, 0.47, [7, 8, 7] [2, 2, 4]Efficient Training on Very Large Corpora via Gramian Estimation
- 7.33, 1.25, [6, 9, 7] [5, 4, 4]Gradient descent aligns the layers of deep linear networks
- 7.33, 0.94, [6, 8, 8] [3, 4, 4]Deep Decoder: Concise Image Representations from Untrained Non-convolutional Networks
- 7.33, 0.47, [7, 8, 7] [5, 4, 3]Time-Agnostic Prediction: Predicting Predictable Video Frames
- 7.33, 2.36, [4, 9, 9] [4, 4, 5]Biologically-Plausible Learning Algorithms Can Scale to Large Datasets
- 7.33, 0.47, [7, 8, 7] [5, 4, 4]Towards Metamerism via Foveated Style Transfer
- 7.33, 0.47, [7, 7, 8] [5, 5, 5]Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control
- 7.33, 0.47, [8, 7, 7] [4, 5, 3]LanczosNet: Multi-Scale Deep Graph Convolutional Networks
- 7.33, 0.47, [8, 7, 7] [4, 4, 3]Visualizing and Understanding Generative Adversarial Networks
- 7.33, 1.25, [9, 6, 7] [4, 4, 4]Label super-resolution networks
- 7.00, 0.82, [7, 8, 6] [4, 4, 4]Deep, Skinny Neural Networks are not Universal Approximators
- 7.00, 0.82, [8, 7, 6] [3, 5, 2]DARTS: Differentiable Architecture Search
- 7.00, 1.41, [6, 9, 6] [3, 5, 4]Diffusion Scattering Transforms on Graphs
- 7.00, 1.63, [7, 5, 9] [5, 3, 4]ADVERSARIAL DOMAIN ADAPTATION FOR STABLE BRAIN-MACHINE INTERFACES
- 7.00, 0.00, [7, 7, 7] [4, 2, 2]CoT: Cooperative Training for Generative Modeling of Discrete Data
- 7.00, 0.82, [6, 7, 8] [3, 4, 4]An analytic theory of generalization dynamics and transfer learning in deep linear networks
- 7.00, 0.00, [7, 7, 7] [5, 3, 3]Deterministic Variational Inference for Robust Bayesian Neural Networks
- 7.00, 0.82, [6, 7, 8] [3, 4, 4]Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering
- 7.00, 0.00, [7, 7, 7] [3, 4, 4]EMI: Exploration with Mutual Information Maximizing State and Action Embeddings
- 7.00, 0.00, [7, 7, 7] [3, 4, 3]Learning a SAT Solver from Single-Bit Supervision
- 7.00, 0.00, [7, 7, 7] [5, 4, 4]Generative Code Modeling with Graphs
- 7.00, 0.82, [8, 6, 7] [4, 2, 4]Meta-Learning Probabilistic Inference for Prediction
- 7.00, 0.00, [7, 7, 7] [4, 3, 4]Relaxed Quantization for Discretized Neural Networks
- 7.00, 0.82, [7, 8, 6] [4, 2, 4]Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data
- 7.00, 0.00, [7, 7, 7] [4, 4, 5]Auxiliary Variational MCMC
- 7.00, 1.41, [9, 6, 6] [4, 4, 5]SNIP: SINGLE-SHOT NETWORK PRUNING BASED ON CONNECTION SENSITIVITY
- 7.00, 1.63, [7, 5, 9] [3, 4, 4]Deep Graph Infomax
- 7.00, 0.00, [7, 7, 7] [4, 5, 3]Riemannian Adaptive Optimization Methods
- 7.00, 0.82, [8, 7, 6] [2, 3, 4]Detecting Egregious Responses in Neural Sequence-to-sequence Models
- 7.00, 0.82, [7, 8, 6] [5, 5, 3]Deep Learning 3D Shapes Using Alt-az Anisotropic 2-Sphere Convolution
- 7.00, 0.00, [7, 7, 7] [5, 2, 4]How Important is a Neuron
- 7.00, 0.00, [7, 7, 7] [3, 3, 3]Neural network gradient-based learning of black-box function interfaces
- 7.00, 0.82, [8, 6, 7] [4, 5, 4]Wizard of Wikipedia: Knowledge-Powered Conversational Agents
- 7.00, 2.16, [9, 4, 8] [4, 5, 5]Invariant and Equivariant Graph Networks
- 7.00, 1.41, [8, 8, 5] [3, 5, 5]The effects of neural resource constraints on early visual representations
- 7.00, 0.82, [8, 6, 7] [4, 2, 3]Recurrent Experience Replay in Distributed Reinforcement Learning
- 7.00, 1.41, [9, 6, 6] [3, 4, 4]Padam: Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
- 7.00, 0.82, [7, 6, 8] [3, 4, 2]Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
- 7.00, 0.82, [8, 6, 7] [3, 4, 4]Quasi-hyperbolic momentum and Adam for deep learning
- 7.00, 0.00, [7, 7, 7] [3, 3, 3]The Comparative Power of ReLU Networks and Polynomial Kernels in the Presence of Sparse Latent Structure
- 7.00, 1.41, [8, 5, 8] [4, 2, 3]Learning Self-Imitating Diverse Policies
- 7.00, 1.41, [8, 5, 8] [4, 4, 5]Unsupervised Domain Adaptation for Distance Metric Learning
- 7.00, 0.00, [7, 7, 7] [3, 4, 4]Scalable Reversible Generative Models with Free-form Continuous Dynamics
- 7.00, 1.63, [7, 9, 5] [3, 4, 4]Feature Intertwiners
- 7.00, 0.00, [7, 7, 7] [4, 4, 4]Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching
- 7.00, 1.63, [7, 5, 9] [4, 3, 4]ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
- 7.00, 0.82, [6, 8, 7] [4, 4, 4]textTOvec: DEEP CONTEXTUALIZED NEURAL AUTOREGRESSIVE TOPIC MODELS OF LANGUAGE WITH DISTRIBUTED COMPOSITIONAL PRIOR
- 7.00, 1.41, [8, 5, 8] [4, 5, 5]Local SGD Converges Fast and Communicates Little
- 7.00, 0.00, [7, 7, 7] [3, 5, 3]The role of over-parametrization in generalization of neural networks
- 7.00, 1.63, [9, 5, 7] [5, 4, 4]The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
- 7.00, 1.41, [5, 8, 8] [3, 4, 3]Don’t Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors
- 7.00, 0.00, [7, 7, 7] [4, 4, 4]What do you learn from context? Probing for sentence structure in contextualized word representations
- 7.00, 0.82, [6, 7, 8] [4, 3, 4]Learning Implicitly Recurrent CNNs Through Parameter Sharing
- 7.00, 1.41, [8, 8, 5] [4, 4, 4]The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- 7.00, 0.82, [6, 8, 7] [3, 4, 4]Learning Neural PDE Solvers with Convergence Guarantees
- 7.00, 0.82, [8, 6, 7] [4, 4, 4]Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
- 7.00, 0.00, [7, 7, 7] [3, 3, 3]Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL
- 7.00, 0.00, [7, 7, 7] [5, 3, 3]Modeling Uncertainty with Hedged Instance Embeddings
- 7.00, 0.00, [7, 7, 7] [3, 3, 3]Learning to Navigate the Web
- 7.00, 0.00, [7, 7, 7] [4, 3, 2]G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
- 7.00, 0.82, [8, 7, 6] [3, 4, 3]GANSynth: Adversarial Neural Audio Synthesis
- 7.00, 0.82, [8, 6, 7] [4, 3, 5]K For The Price Of 1: Parameter Efficient Multi-task And Transfer Learning
- 7.00, 0.82, [8, 7, 6] [3, 2, 4]Learning sparse relational transition models
- 7.00, 0.82, [8, 6, 7] [4, 3, 4]Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks
- 7.00, 0.82, [8, 6, 7] [3, 3, 3]On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks
- 7.00, 1.41, [5, 8, 8] [3, 2, 2]Global-to-local Memory Pointer Networks for Task-Oriented Dialogue
- 6.80, 0.40, [6, 7, 7, 7, 7] [1, 3, 4, 3, 2]Subgradient Descent Learns Orthogonal Dictionaries
- 6.67, 1.25, [5, 7, 8] [3, 3, 2]Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability
- 6.67, 0.47, [6, 7, 7] [3, 4, 2]RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
- 6.67, 0.94, [8, 6, 6] [3, 3, 4]Principled Deep Neural Network Training through Linear Programming
- 6.67, 0.94, [8, 6, 6] [4, 4, 4]Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
- 6.67, 0.47, [7, 7, 6] [4, 4, 4]Analysis of Quantized Models
- 6.67, 0.94, [8, 6, 6] [5, 3, 4]Practical lossless compression with latent variables using bits back coding
- 6.67, 0.47, [6, 7, 7] [5, 4, 4]Sample Efficient Adaptive Text-to-Speech
- 6.67, 1.25, [7, 5, 8] [3, 4, 4]Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks
- 6.67, 1.89, [8, 4, 8] [5, 4, 4]LeMoNADe: Learned Motif and Neuronal Assembly Detection in calcium imaging videos
- 6.67, 0.47, [7, 7, 6] [3, 3, 2]Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
- 6.67, 0.47, [6, 7, 7] [3, 3, 4]Towards the first adversarially robust neural network model on MNIST
- 6.67, 0.94, [6, 8, 6] [3, 4, 4]Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
- 6.67, 0.47, [6, 7, 7] [3, 4, 4]Meta-Learning For Stochastic Gradient MCMC
- 6.67, 0.47, [7, 6, 7] [3, 3, 3]Trellis Networks for Sequence Modeling
- 6.67, 0.47, [6, 7, 7] [3, 3, 4]ADef: an Iterative Algorithm to Construct Adversarial Deformations
- 6.67, 0.47, [7, 6, 7] [3, 2, 4]Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
- 6.67, 0.94, [6, 8, 6] [2, 5, 3]Learning to Schedule Communication in Multi-agent Reinforcement Learning
- 6.67, 0.47, [6, 7, 7] [3, 2, 3]Learning Factorized Multimodal Representations
- 6.67, 0.94, [8, 6, 6] [4, 5, 4]Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
- 6.67, 1.70, [6, 5, 9] [4, 2, 4]Deep Self-Organization: Interpretable Discrete Representation Learning on Time Series
- 6.67, 0.47, [7, 6, 7] [3, 4, 3]Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network
- 6.67, 0.47, [7, 7, 6] [4, 5, 5]Hyperbolic Attention Networks
- 6.67, 1.25, [7, 5, 8] [3, 3, 3]NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning
- 6.67, 0.47, [7, 7, 6] [4, 2, 3]Latent Convolutional Models
- 6.67, 1.25, [8, 7, 5] [4, 4, 3]Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces
- 6.67, 0.47, [7, 7, 6] [4, 3, 4]Generalized Tensor Models for Recurrent Neural Networks
- 6.67, 0.94, [6, 8, 6] [2, 4, 4]Bayesian Prediction of Future Street Scenes using Synthetic Likelihoods
- 6.67, 0.94, [8, 6, 6] [3, 4, 4]Episodic Curiosity through Reachability
- 6.67, 0.47, [6, 7, 7] [3, 4, 4]Beyond Pixel Norm-Balls: Parametric Adversaries using an Analytically Differentiable Renderer
- 6.67, 0.47, [7, 6, 7] [3, 4, 4]Solving the Rubik’s Cube with Approximate Policy Iteration
- 6.67, 0.47, [7, 6, 7] [4, 5, 3]Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach
- 6.67, 0.94, [6, 8, 6] [3, 4, 3]Unsupervised Learning via Meta-Learning
- 6.67, 0.94, [6, 6, 8] [2, 2, 2]Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality
- 6.67, 0.94, [6, 6, 8] [3, 2, 4]A Data-Driven and Distributed Approach to Sparse Signal Representation and Recovery
- 6.67, 0.47, [7, 6, 7] [4, 5, 4]Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
- 6.67, 1.25, [8, 5, 7] [4, 4, 4]No Training Required: Exploring Random Encoders for Sentence Classification
- 6.67, 0.47, [7, 6, 7] [5, 4, 4]Learning Two-layer Neural Networks with Symmetric Inputs
- 6.67, 0.47, [7, 7, 6] [4, 4, 4]Phase-Aware Speech Enhancement with Deep Complex U-Net
- 6.67, 1.25, [8, 5, 7] [1, 5, 4]Deep Layers as Stochastic Solvers
- 6.67, 0.47, [7, 6, 7] [4, 4, 4]Graph HyperNetworks for Neural Architecture Search
- 6.67, 1.25, [7, 8, 5] [4, 4, 4]Complement Objective Training
- 6.67, 0.47, [7, 6, 7] [4, 4, 3]Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
- 6.67, 0.94, [8, 6, 6] [4, 4, 4]Generative Question Answering: Learning to Answer the Whole Question
- 6.67, 1.70, [6, 9, 5] [3, 4, 4]Detecting Adversarial Examples Via Neural Fingerprinting
- 6.67, 1.25, [8, 5, 7] [4, 3, 1]Learning concise representations for regression by evolving networks of trees
- 6.67, 0.47, [6, 7, 7] [3, 4, 4]Optimal Completion Distillation for Sequence Learning
- 6.67, 1.70, [9, 5, 6] [4, 4, 4]AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods
- 6.67, 0.47, [7, 6, 7] [4, 3, 1]Visual Semantic Navigation using Scene Priors
- 6.67, 0.47, [7, 7, 6] [3, 5, 4]Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies
- 6.67, 0.47, [6, 7, 7] [4, 5, 4]A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
- 6.67, 0.47, [6, 7, 7] [4, 4, 4]Adversarial Attacks on Graph Neural Networks via Meta Learning
- 6.67, 0.47, [6, 7, 7] [4, 5, 3]Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives
- 6.67, 0.47, [7, 7, 6] [4, 4, 4]Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
- 6.67, 0.47, [7, 7, 6] [5, 4, 4]Three Mechanisms of Weight Decay Regularization
- 6.67, 0.47, [6, 7, 7] [4, 2, 2]Theoretical Analysis of Auto Rate-Tuning by Batch Normalization
- 6.67, 0.94, [6, 8, 6] [3, 4, 4]Transferring Knowledge across Learning Processes
- 6.67, 1.70, [9, 5, 6] [3, 4, 1]Dimensionality Reduction for Representing the Knowledge of Probabilistic Models
- 6.67, 0.47, [7, 6, 7] [2, 3, 4]Defensive Quantization: When Efficiency Meets Robustness
- 6.67, 1.25, [7, 8, 5] [3, 4, 5]Learning To Solve Circuit-SAT: An Unsupervised Differentiable Approach
- 6.67, 0.47, [7, 6, 7] [4, 4, 5]FlowQA: Grasping Flow in History for Conversational Machine Comprehension
- 6.67, 0.94, [6, 6, 8] [4, 5, 4]Learning Grid-like Units with Vector Representation of Self-Position and Matrix Representation of Self-Motion
- 6.67, 1.25, [8, 5, 7] [4, 2, 1]GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
- 6.67, 0.47, [7, 7, 6] [3, 3, 2]Automatically Composing Representation Transformations as a Means for Generalization
- 6.67, 0.47, [6, 7, 7] [4, 4, 3]RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space
- 6.67, 1.25, [8, 7, 5] [4, 4, 3]Looking for ELMo’s friends: Sentence-Level Pretraining Beyond Language Modeling
- 6.67, 0.47, [7, 6, 7] [3, 1, 3]A Mean Field Theory of Batch Normalization
- 6.67, 1.25, [5, 7, 8] [3, 3, 4]Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
- 6.67, 0.47, [7, 6, 7] [4, 3, 4]Active Learning with Partial Feedback
- 6.67, 0.47, [7, 6, 7] [4, 5, 4]Learning from Incomplete Data with Generative Adversarial Networks
- 6.67, 0.47, [7, 6, 7] [3, 4, 4]Do Deep Generative Models Know What They Don’t Know?
- 6.67, 0.94, [6, 8, 6] [4, 4, 4]RelGAN: Relational Generative Adversarial Networks for Text Generation
- 6.67, 0.47, [7, 6, 7] [2, 2, 2]Provable Online Dictionary Learning and Sparse Coding
- 6.67, 0.94, [6, 6, 8] [4, 4, 4]Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions
- 6.67, 0.47, [7, 7, 6] [4, 5, 5]SPIGAN: Privileged Adversarial Learning from Simulation
- 6.67, 0.47, [7, 6, 7] [4, 3, 4]Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
- 6.67, 0.47, [7, 7, 6] [4, 5, 4]Learning to Infer and Execute 3D Shape Programs
- 6.67, 1.89, [8, 4, 8] [4, 4, 4]A Generative Model For Electron Paths
- 6.67, 0.94, [6, 6, 8] [3, 2, 4]Stochastic Optimization of Sorting Networks via Continuous Relaxations
- 6.67, 0.47, [7, 6, 7] [2, 5, 4]Learning a Meta-Solver for Syntax-Guided Program Synthesis
- 6.67, 0.94, [6, 8, 6] [1, 4, 4]There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
- 6.67, 1.25, [7, 8, 5] [5, 4, 5]Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference
- 6.50, 1.50, [4, 7, 7, 8] [3, 3, 2, 5]Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience
- 6.33, 0.94, [7, 5, 7] [3, 5, 3]Stochastic Gradient Descent Learns State Equations with Nonlinear Activations
- 6.33, 0.47, [6, 6, 7] [4, 3, 3]Improved Gradient Estimators for Stochastic Discrete Variables
- 6.33, 1.70, [7, 4, 8] [3, 5, 5]Learning Preconditioner on Matrix Lie Group
- 6.33, 1.25, [8, 5, 6] [4, 4, 4]Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator
- 6.33, 0.47, [7, 6, 6] [5, 4, 2]Local Critic Training of Deep Neural Networks
- 6.33, 1.70, [4, 8, 7] [4, 4, 4]Are adversarial examples inevitable?
- 6.33, 1.25, [6, 8, 5] [4, 4, 4]Generating Multiple Objects at Spatially Distinct Locations
- 6.33, 0.47, [6, 6, 7] [4, 4, 3]DELTA: DEEP LEARNING TRANSFER USING FEATURE MAP WITH ATTENTION FOR CONVOLUTIONAL NETWORKS
- 6.33, 1.25, [6, 5, 8] [2, 5, 3]signSGD via Zeroth-Order Oracle
- 6.33, 0.47, [6, 7, 6] [2, 2, 4]Reward Constrained Policy Optimization
- 6.33, 1.25, [5, 6, 8] [5, 5, 4]Quaternion Recurrent Neural Networks
- 6.33, 0.94, [5, 7, 7] [3, 3, 4]DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder
- 6.33, 1.89, [5, 5, 9] [4, 3, 5]Laplacian Networks: Bounding Indicator Function Smoothness for Neural Networks Robustness
- 6.33, 0.94, [5, 7, 7] [4, 4, 5]Why do deep convolutional networks generalize so poorly to small image transformations?
- 6.33, 1.25, [6, 8, 5] [4, 3, 3]Hierarchical Visuomotor Control of Humanoids
- 6.33, 0.94, [7, 5, 7] [4, 4, 4]Hindsight policy gradients
- 6.33, 0.47, [7, 6, 6] [4, 4, 4]Attentive Neural Processes
- 6.33, 0.94, [7, 5, 7] [5, 5, 4]ROBUST ESTIMATION VIA GENERATIVE ADVERSARIAL NETWORKS
- 6.33, 0.94, [7, 5, 7] [5, 4, 2]Execution-Guided Neural Program Synthesis
- 6.33, 0.47, [7, 6, 6] [4, 5, 3]Dynamically Unfolding Recurrent Restorer: A Moving Endpoint Control Method for Image Restoration
- 6.33, 1.25, [5, 8, 6] [4, 3, 3]Learning Recurrent Binary/Ternary Weights
- 6.33, 0.47, [7, 6, 6] [5, 5, 5]Attention, Learn to Solve Routing Problems!
- 6.33, 0.94, [5, 7, 7] [4, 3, 3]Improving Generalization and Stability of Generative Adversarial Networks
- 6.33, 0.47, [7, 6, 6] [5, 4, 4]Visceral Machines: Reinforcement Learning with Intrinsic Physiological Rewards
- 6.33, 0.47, [6, 6, 7] [4, 3, 3]Marginal Policy Gradients: A Unified Family of Estimators for Bounded Action Spaces with Applications
- 6.33, 0.94, [5, 7, 7] [4, 2, 3]L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data
- 6.33, 0.94, [7, 7, 5] [4, 3, 4]Deep reinforcement learning with relational inductive biases
- 6.33, 0.47, [6, 7, 6] [4, 4, 4]GO Gradient for Expectation-Based Objectives
- 6.33, 0.94, [7, 5, 7] [3, 4, 4]PATE-GAN: Generating Synthetic Data with Differential Privacy Guarantees
- 6.33, 0.94, [7, 5, 7] [2, 3, 4]Bias-Reduced Uncertainty Estimation for Deep Neural Classifiers
- 6.33, 1.25, [6, 8, 5] [5, 5, 4]Multi-Domain Adversarial Learning
- 6.33, 0.47, [6, 7, 6] [5, 5, 2]Improving MMD-GAN Training with Repulsive Loss Function
- 6.33, 0.47, [6, 6, 7] [4, 3, 4]FUNCTIONAL VARIATIONAL BAYESIAN NEURAL NETWORKS
- 6.33, 1.25, [5, 6, 8] [4, 4, 4]Autoencoder-based Music Translation
- 6.33, 1.25, [6, 5, 8] [3, 4, 5]Fluctuation-dissipation relations for stochastic gradient descent
- 6.33, 0.47, [7, 6, 6] [3, 4, 4]Adaptive Estimators Show Information Compression in Deep Neural Networks
- 6.33, 1.25, [5, 8, 6] [5, 4, 4]On the loss landscape of a class of deep neural networks with no bad local valleys
- 6.33, 0.47, [6, 7, 6] [4, 4, 3]Multilingual Neural Machine Translation with Knowledge Distillation
- 6.33, 0.94, [5, 7, 7] [3, 3, 3]Emergent Coordination Through Competition
- 6.33, 1.70, [7, 8, 4] [4, 5, 3]Knowledge Flow: Improve Upon Your Teachers
- 6.33, 0.94, [5, 7, 7] [4, 3, 3]Representation Degeneration Problem in Training Natural Language Generation Models
- 6.33, 0.47, [6, 7, 6] [4, 4, 4]SNAS: stochastic neural architecture search
- 6.33, 0.47, [6, 6, 7] [3, 4, 2]Understanding Composition of Word Embeddings via Tensor Decomposition
- 6.33, 1.25, [6, 5, 8] [4, 4, 5]Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
- 6.33, 0.47, [6, 6, 7] [4, 4, 4]RNNs implicitly implement tensor-product representations
- 6.33, 0.94, [5, 7, 7] [2, 3, 2]STRUCTURED ADVERSARIAL ATTACK: TOWARDS GENERAL IMPLEMENTATION AND BETTER INTERPRETABILITY
- 6.33, 0.94, [7, 7, 5] [3, 4, 5]Learning deep representations by mutual information estimation and maximization
- 6.33, 0.47, [7, 6, 6] [3, 4, 3]Bayesian Policy Optimization for Model Uncertainty
- 6.33, 1.25, [6, 8, 5] [3, 5, 3]A NOVEL VARIATIONAL FAMILY FOR HIDDEN NON-LINEAR MARKOV MODELS
- 6.33, 0.47, [7, 6, 6] [5, 4, 3]From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference
- 6.33, 0.47, [6, 6, 7] [4, 3, 4]Discriminator Rejection Sampling
- 6.33, 0.47, [6, 7, 6] [5, 5, 5]AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks
- 6.33, 0.94, [7, 5, 7] [4, 3, 3]The Laplacian in RL: Learning Representations with Efficient Approximations
- 6.33, 0.47, [7, 6, 6] [4, 2, 4]On Computation and Generalization of Generative Adversarial Networks under Spectrum Control
- 6.33, 0.47, [7, 6, 6] [5, 3, 3]Learning Finite State Representations of Recurrent Policy Networks
- 6.33, 0.47, [7, 6, 6] [2, 3, 5]Analyzing Inverse Problems with Invertible Neural Networks
- 6.33, 0.94, [7, 5, 7] [4, 5, 4]On Self Modulation for Generative Adversarial Networks
- 6.33, 0.94, [5, 7, 7] [5, 3, 4]Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation
- 6.33, 0.47, [7, 6, 6] [4, 2, 4]Universal Transformers
- 6.33, 0.47, [7, 6, 6] [5, 3, 4]Variational Autoencoders with Jointly Optimized Latent Dependency Structure
- 6.33, 1.25, [5, 6, 8] [4, 5, 4]Hierarchical Generative Modeling for Controllable Speech Synthesis
- 6.33, 0.47, [6, 6, 7] [3, 3, 3]Individualized Controlled Continuous Communication Model for Multiagent Cooperative and Competitive Tasks
- 6.33, 1.25, [5, 6, 8] [4, 2, 3]A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations
- 6.33, 0.47, [7, 6, 6] [5, 4, 4]Instance-aware Image-to-Image Translation
- 6.33, 1.70, [7, 8, 4] [3, 4, 4]The Deep Weight Prior
- 6.33, 1.70, [8, 4, 7] [4, 4, 4]Janossy Pooling: Learning Deep Permutation-Invariant Functions for Variable-Size Inputs
- 6.33, 1.89, [9, 5, 5] [5, 4, 4]From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following
- 6.33, 0.47, [6, 7, 6] [4, 4, 4]Empirical Bounds on Linear Regions of Deep Rectifier Networks
- 6.33, 0.47, [7, 6, 6] [4, 4, 5]Multilingual Neural Machine Translation With Soft Decoupled Encoding
- 6.33, 0.47, [6, 7, 6] [3, 2, 3]On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
- 6.33, 0.47, [6, 6, 7] [4, 4, 5]MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
- 6.33, 1.70, [7, 8, 4] [3, 4, 3]CAMOU: Learning Physical Vehicle Camouflages to Adversarially Attack Detectors in the Wild
- 6.33, 1.25, [5, 6, 8] [4, 3, 4]BNN+: Improved Binary Network Training
- 6.33, 1.70, [8, 7, 4] [3, 4, 5]Statistical Verification of Neural Networks
- 6.33, 1.25, [8, 5, 6] [4, 5, 4]Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency
- 6.33, 0.47, [6, 6, 7] [4, 4, 2]Stable Recurrent Models
- 6.33, 0.94, [7, 5, 7] [3, 5, 2]Learning Mixed-Curvature Representations in Product Spaces
- 6.33, 0.47, [6, 6, 7] [3, 3, 3]Generating Multi-Agent Trajectories using Programmatic Weak Supervision
- 6.33, 0.47, [7, 6, 6] [4, 4, 4]Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension
- 6.33, 2.05, [4, 6, 9] [4, 4, 4]BA-Net: Dense Bundle Adjustment Networks
- 6.33, 2.05, [4, 9, 6] [4, 4, 4]Variance Reduction for Reinforcement Learning in Input-Driven Environments
- 6.33, 1.89, [5, 9, 5] [4, 4, 4]Predicting the Generalization Gap in Deep Networks with Margin Distributions
- 6.33, 1.25, [6, 5, 8] [4, 5, 5]Unsupervised Control Through Non-Parametric Discriminative Rewards
- 6.33, 0.94, [7, 5, 7] [4, 5, 3]Information asymmetry in KL-regularized RL
- 6.33, 0.94, [7, 5, 7] [5, 5, 3]Diversity-Sensitive Conditional Generative Adversarial Networks
- 6.33, 0.94, [7, 5, 7] [3, 4, 3]The Unreasonable Effectiveness of (Zero) Initialization in Deep Residual Learning
- 6.33, 0.47, [6, 7, 6] [3, 4, 3]Preventing Posterior Collapse with delta-VAEs
- 6.33, 1.70, [8, 7, 4] [4, 4, 5]TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
- 6.33, 0.47, [6, 7, 6] [5, 4, 4]Feature-Wise Bias Amplification
- 6.33, 1.25, [8, 5, 6] [5, 5, 3]Machine Translation With Weakly Paired Bilingual Documents
- 6.33, 0.94, [5, 7, 7] [4, 3, 3]Don’t let your Discriminator be fooled
- 6.33, 1.89, [5, 5, 9] [4, 3, 4]Diagnosing and Enhancing VAE Models
- 6.33, 0.47, [7, 6, 6] [5, 3, 3]Spherical CNNs on Unstructured Grids
- 6.33, 2.05, [6, 9, 4] [5, 4, 5]Toward Understanding the Impact of Staleness in Distributed Machine Learning
- 6.33, 0.94, [7, 5, 7] [2, 4, 3]On the Sensitivity of Adversarial Robustness to Input Data Distributions
- 6.33, 1.89, [9, 5, 5] [4, 4, 5]Reasoning About Physical Interactions with Object-Centric Models
- 6.33, 0.47, [6, 6, 7] [3, 4, 3]Multiple-Attribute Text Rewriting
- 6.33, 1.25, [6, 8, 5] [4, 4, 4]Neural Graph Evolution: Automatic Robot Design
- 6.33, 0.47, [6, 7, 6] [4, 3, 4]Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
- 6.33, 1.25, [8, 6, 5] [4, 5, 4]DyRep: Learning Representations over Dynamic Graphs
- 6.33, 0.47, [6, 6, 7] [4, 4, 5]Eidetic 3D LSTM: A Model for Video Prediction and Beyond
- 6.33, 1.70, [7, 4, 8] [3, 3, 5]Probabilistic Neural-Symbolic Models for Interpretable Visual Question Answering
- 6.33, 0.94, [5, 7, 7] [4, 2, 3]The Limitations of Adversarial Training and the Blind-Spot Attack
- 6.33, 0.47, [6, 6, 7] [4, 4, 3]Regularized Learning for Domain Adaptation under Label Shifts
- 6.25, 0.83, [7, 7, 5, 6] [4, 1, 4, 4]Towards Consistent Performance on Atari using Expert Demonstrations
- 6.25, 0.83, [5, 7, 7, 6] [3, 3, 4, 5]Learning Protein Structure with a Differentiable Simulator
- 6.25, 0.83, [7, 5, 7, 6] [4, 3, 3, 4]The Implicit Preference Information in an Initial State
- 6.25, 0.83, [7, 6, 7, 5] [5, 4, 4, 4]Competitive experience replay
- 6.25, 1.09, [8, 6, 6, 5] [3, 3, 2, 4]Efficiently testing local optimality and escaping saddles for ReLU networks
- 6.25, 1.92, [7, 3, 8, 7] [1, 4, 1, 4]DISTRIBUTIONAL CONCAVITY REGULARIZATION FOR GANS
- 6.00, 0.82, [7, 6, 5] [3, 4, 3]Invariance and Inverse Stability under ReLU
- 6.00, 0.82, [5, 7, 6] [3, 4, 5]Precision Highway for Ultra Low-precision Quantization
- 6.00, 0.82, [7, 5, 6] [4, 3, 5]Large Scale Graph Learning From Smooth Signals
- 6.00, 0.82, [5, 6, 7] [3, 4, 4]L2-Nonexpansive Neural Networks
- 6.00, 0.82, [6, 7, 5] [4, 3, 4]Adversarial Imitation via Variational Inverse Reinforcement Learning
- 6.00, 1.41, [7, 4, 7] [3, 4, 3]Monge-Ampère Flow for Generative Modeling
- 6.00, 0.00, [6, 6, 6] [3, 4, 3]INVASE: Instance-wise Variable Selection using Neural Networks
- 6.00, 0.82, [6, 5, 7] [4, 5, 4]DPSNet: End-to-end Deep Plane Sweep Stereo
- 6.00, 2.45, [6, 3, 9] [3, 4, 2]SUPERVISED POLICY UPDATE
- 6.00, 0.00, [6, 6, 6] [4, 5, 4]DATNet: Dual Adversarial Transfer for Low-resource Named Entity Recognition
- 6.00, 2.16, [8, 3, 7] [3, 4, 4]A rotation-equivariant convolutional neural network model of primary visual cortex
- 6.00, 1.41, [7, 7, 4] [4, 4, 4]ANYTIME MINIBATCH: EXPLOITING STRAGGLERS IN ONLINE DISTRIBUTED OPTIMIZATION
- 6.00, 0.71, [6, 6, 7, 5] [2, 3, 2, 4]Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection
- 6.00, 0.00, [6, 6, 6] [3, 3, 3]Semi-supervised Learning with Multi-Domain Sentiment Word Embeddings
- 6.00, 0.00, [6, 6, 6] [4, 4, 3]Variance Networks: When Expectation Does Not Meet Your Expectations
- 6.00, 0.82, [7, 5, 6] [4, 4, 4]Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis
- 6.00, 0.82, [6, 7, 5] [4, 4, 3]On Tighter Generalization Bounds for Deep Neural Networks: CNNs, ResNets, and Beyond
- 6.00, 1.63, [4, 6, 8] [4, 3, 5]Formal Limitations on the Measurement of Mutual Information
- 6.00, 0.00, [6, 6, 6] [4, 5, 3]Feed-forward Propagation in Probabilistic Neural Networks with Categorical and Max Layers
- 6.00, 0.82, [7, 5, 6] [3, 5, 4]Dirichlet Variational Autoencoder
- 6.00, 1.41, [8, 5, 5] [4, 2, 4]Learning Kolmogorov Models for Binary Random Variables
- 6.00, 1.41, [4, 7, 7] [3, 5, 4]Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering
- 6.00, 1.63, [8, 4, 6] [3, 5, 3]Are Generative Classifiers More Robust to Adversarial Attacks?
- 6.00, 0.82, [7, 6, 5] [3, 3, 4]EFFICIENT TWO-STEP ADVERSARIAL DEFENSE FOR DEEP NEURAL NETWORKS
- 6.00, 0.82, [5, 6, 7] [4, 4, 3]POLICY GENERALIZATION IN CAPACITY-LIMITED REINFORCEMENT LEARNING
- 6.00, 1.87, [5, 9, 4, 6] [5, 4, 5, 4]Adversarial Vulnerability of Neural Networks Increases with Input Dimension
- 6.00, 1.41, [7, 7, 4] [2, 3, 4]GamePad: A Learning Environment for Theorem Proving
- 6.00, 1.41, [7, 4, 7] [3, 5, 4]The Singular Values of Convolutional Layers
- 6.00, 1.41, [5, 8, 5] [4, 4, 4]code2seq: Generating Sequences from Structured Representations of Code
- 6.00, 1.00, [5, 7] [5, 4]PeerNets: Exploiting Peer Wisdom Against Adversarial Attacks
- 6.00, 1.63, [8, 4, 6] [3, 4, 4]Manifold Mixup: Learning Better Representations by Interpolating Hidden States
- 6.00, 0.82, [7, 5, 6] [5, 5, 3]Temporal Gaussian Mixture Layer for Videos
- 6.00, 0.82, [7, 5, 6] [4, 4, 5]Neural Speed Reading with Structural-Jump-LSTM
- 6.00, 0.82, [6, 7, 5] [4, 3, 3]Information Theoretic lower bounds on negative log likelihood
- 6.00, 0.71, [7, 6, 6, 5] [3, 3, 3, 4]Sinkhorn AutoEncoders
- 6.00, 0.00, [6, 6, 6] [4, 4, 2]Neural Networks for Modeling Source Code Edits
- 6.00, 0.82, [6, 5, 7] [4, 3, 4]LayoutGAN: Generating Graphic Layouts with Wireframe Discriminator
- 6.00, 0.82, [7, 5, 6] [4, 5, 4]SGD Converges to Global Minimum in Deep Learning via Star-convex Path
- 6.00, 0.82, [5, 6, 7] [4, 4, 2]Learning from Positive and Unlabeled Data with a Selection Bias
- 6.00, 0.82, [5, 6, 7] [4, 3, 3]Aggregated Momentum: Stability Through Passive Damping
- 6.00, 1.63, [6, 8, 4] [4, 4, 4]ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness.
- 6.00, 0.00, [6, 6, 6] [4, 4, 3]Countering Language Drift via Grounding
- 6.00, 0.00, [6, 6, 6] [4, 4, 4]Measuring Compositionality in Representation Learning
- 6.00, 2.16, [9, 5, 4] [5, 4, 4]A Biologically Inspired Visual Working Memory for Deep Networks
- 6.00, 0.82, [6, 5, 7] [4, 2, 3]Universal Successor Features Approximators
- 6.00, 1.41, [5, 8, 5] [4, 3, 5]Deep Convolutional Networks as shallow Gaussian Processes
- 6.00, 0.00, [6, 6, 6] [4, 3, 4]Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks
- 6.00, 0.82, [7, 5, 6] [1, 3, 3]Variational Bayesian Phylogenetic Inference
- 6.00, 0.71, [5, 6, 6, 7] [4, 3, 3, 3]Relational Forward Models for Multi-Agent Learning
- 6.00, 0.82, [7, 5, 6] [4, 5, 3]Generative predecessor models for sample-efficient imitation learning
- 6.00, 0.82, [5, 6, 7] [5, 5, 3]Optimistic mirror descent in saddle-point problems: Going the extra(-gradient) mile
- 6.00, 0.00, [6, 6, 6] [1, 2, 4]Stable Opponent Shaping in Differentiable Games
- 6.00, 0.82, [7, 6, 5] [4, 4, 4]DeepOBS: A Deep Learning Optimizer Benchmark Suite
- 6.00, 0.82, [6, 7, 5] [4, 4, 4]Policy Transfer with Strategy Optimization
- 6.00, 1.41, [4, 7, 7] [4, 4, 4]Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder
- 6.00, 0.82, [7, 6, 5] [2, 4, 3]Graph Convolutional Network with Sequential Attention For Goal-Oriented Dialogue Systems
- 6.00, 1.63, [8, 4, 6] [3, 4, 3]Integer Networks for Data Compression with Latent-Variable Models
- 6.00, 0.82, [6, 5, 7] [3, 3, 5]Residual Non-local Attention Networks for Image Restoration
- 6.00, 0.00, [6, 6, 6] [4, 3, 4]Information-Directed Exploration for Deep Reinforcement Learning
- 6.00, 0.82, [5, 7, 6] [5, 4, 4]Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs
- 6.00, 1.87, [7, 8, 6, 3] [4, 4, 4, 5]Gradient Descent Provably Optimizes Over-parameterized Neural Networks
- 6.00, 1.41, [4, 7, 7] [4, 3, 3]Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
- 6.00, 1.22, [4, 7, 6, 7] [5, 4, 3, 5]Dynamic Channel Pruning: Feature Boosting and Suppression
- 6.00, 0.82, [7, 6, 5] [3, 4, 3]Unsupervised Hyper-alignment for Multilingual Word Embeddings
- 6.00, 0.00, [6, 6, 6] [3, 4, 5]GraphSeq2Seq: Graph-Sequence-to-Sequence for Neural Machine Translation
- 6.00, 0.82, [5, 7, 6] [4, 4, 3]Multi-class classification without multi-class labels
- 6.00, 1.63, [8, 6, 4] [3, 3, 4]On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length
- 6.00, 0.00, [6, 6, 6] [4, 3, 4]Learning Disentangled Representations with Reference-Based Variational Autoencoders
- 6.00, 1.41, [7, 4, 7] [3, 5, 4]Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning
- 6.00, 1.41, [7, 4, 7] [3, 3, 4]AutoLoss: Learning Discrete Schedule for Alternate Optimization
- 6.00, 0.82, [6, 7, 5] [4, 4, 4]Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures
- 6.00, 0.00, [6, 6, 6] [4, 4, 4]Adversarial Information Factorization
- 6.00, 2.16, [7, 3, 8] [3, 4, 4]ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks
- 6.00, 0.00, [6, 6, 6] [4, 5, 4]BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop
- 6.00, 1.87, [6, 9, 5, 4] [3, 4, 3, 5]Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations
- 6.00, 0.82, [5, 7, 6] [4, 4, 4]Hierarchical Reinforcement Learning with Limited Policies and Hindsight
- 6.00, 2.16, [5, 9, 4] [4, 4, 4]Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity
- 6.00, 2.16, [9, 4, 5] [5, 4, 4]Detecting Memorization in ReLU Networks
- 6.00, 1.63, [6, 4, 8] [4, 4, 3]DADAM: A consensus-based distributed adaptive gradient method for online optimization
- 6.00, 1.63, [4, 6, 8] [4, 3, 4]A Systematic Study of Binary Neural Networks’ Optimisation
- 6.00, 1.41, [7, 4, 7] [4, 4, 5]Graph U-Net
- 6.00, 0.82, [7, 6, 5] [4, 3, 3]LEARNING TO PROPAGATE LABELS: TRANSDUCTIVE PROPAGATION NETWORK FOR FEW-SHOT LEARNING
- 6.00, 1.41, [5, 8, 5] [3, 4, 3]On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent
- 6.00, 1.41, [8, 5, 5] [4, 4, 4]Detecting Out-Of-Distribution Samples Using Low-Order Deep Features Statistics
- 6.00, 0.82, [5, 7, 6] [4, 4, 4]Decoupled Weight Decay Regularization
- 6.00, 0.00, [6, 6, 6] [5, 4, 5]Diversity and Depth in Per-Example Routing Models
- 6.00, 1.41, [5, 5, 8] [4, 4, 4]ProxQuant: Quantized Neural Networks via Proximal Operators
- 6.00, 0.00, [6, 6, 6] [4, 3, 4]Wasserstein Barycenter Model Ensembling
- 6.00, 0.00, [6, 6, 6] [3, 4, 3]Stochastic Gradient Push for Distributed Deep Learning
- 6.00, 0.82, [5, 7, 6] [3, 1, 3]DOM-Q-NET: Grounded RL on Structured Language
- 6.00, 1.41, [8, 5, 5] [5, 3, 5]Meta-Learning with Latent Embedding Optimization
- 6.00, 0.00, [6, 6, 6] [3, 3, 4]Reinforcement Learning with Perturbed Rewards
- 6.00, 0.82, [5, 6, 7] [3, 3, 3]MEAN-FIELD ANALYSIS OF BATCH NORMALIZATION
- 6.00, 1.63, [8, 4, 6] [3, 4, 3]Learning what and where to attend with humans in the loop
- 6.00, 0.82, [7, 6, 5] [4, 5, 3]How to train your MAML
- 6.00, 0.82, [7, 6, 5] [4, 4, 3]Learning Heuristics for Automated Reasoning through Reinforcement Learning
- 6.00, 1.63, [4, 8, 6] [4, 3, 2]Lyapunov-based Safe Policy Optimization
- 6.00, 0.00, [6, 6, 6] [4, 3, 3]Dimension-Free Bounds for Low-Precision Training
- 6.00, 1.41, [4, 7, 7] [3, 4, 4]Overcoming the Disentanglement vs Reconstruction Trade-off via Jacobian Supervision
- 6.00, 0.00, [6, 6, 6] [4, 4, 4]Minimal Images in Deep Neural Networks: Fragile Object Recognition in Natural Images
- 6.00, 1.63, [4, 8, 6] [3, 4, 3]Unsupervised Adversarial Image Reconstruction
- 6.00, 0.00, [6, 6, 6] [4, 2, 3]Environment Probing Interaction Policies
- 6.00, 0.82, [5, 7, 6] [5, 2, 3]Neural Logic Machines
- 6.00, 0.00, [6, 6, 6] [3, 5, 5]Graph Transformer
- 6.00, 1.41, [5, 8, 5] [3, 2, 5]Prior Convictions: Black-box Adversarial Attacks with Bandits and Priors
- 6.00, 0.82, [5, 6, 7] [3, 3, 4]Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions
- 6.00, 0.00, [6, 6, 6] [2, 4, 3]Improving the Generalization of Adversarial Training with Domain Adaptation
- 6.00, 1.63, [8, 6, 4] [2, 4, 4]Learning Abstract Models for Long-Horizon Exploration
- 6.00, 0.82, [6, 7, 5] [3, 3, 4]A Direct Approach to Robust Deep Learning Using Adversarial Networks
- 6.00, 0.82, [5, 7, 6] [4, 4, 3]Spreading vectors for similarity search
- 6.00, 1.63, [4, 6, 8] [4, 4, 4]Probabilistic Planning with Sequential Monte Carlo
- 6.00, 0.82, [6, 7, 5] [3, 3, 2]Recall Traces: Backtracking Models for Efficient Reinforcement Learning
- 6.00, 0.82, [6, 5, 7] [3, 3, 3]Value Propagation Networks
- 6.00, 0.00, [6, 6, 6] [4, 5, 2]A Closer Look at Few-shot Classification
- 6.00, 1.63, [4, 8, 6] [5, 3, 3]Learning to Learn with Conditional Class Dependencies
- 6.00, 0.00, [6, 6, 6] [5, 4, 5]TarMAC: Targeted Multi-Agent Communication
- 6.00, 0.82, [5, 6, 7] [5, 4, 3]A Differentiable Self-disambiguated Sense Embedding Model via Scaled Gumbel Softmax
- 6.00, 0.00, [6, 6, 6] [3, 3, 3]A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS
- 6.00, 0.00, [6, 6, 6] [3, 3, 3]Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures
- 6.00, 0.82, [7, 5, 6] [4, 4, 4]Diverse Machine Translation with a Single Multinomial Latent Variable
- 6.00, 0.00, [6, 6, 6] [2, 4, 4]Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
- 6.00, 0.00, [6, 6, 6] [3, 3, 3]Characterizing Audio Adversarial Examples Using Temporal Dependency
- 6.00, 1.63, [8, 6, 4] [5, 4, 5]Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling
- 6.00, 0.82, [6, 7, 5] [2, 2, 5]The Variational Deficiency Bottleneck
- 6.00, 0.82, [7, 6, 5] [4, 4, 4]Combinatorial Attacks on Binarized Neural Networks
- 6.00, 0.82, [5, 7, 6] [4, 2, 3]Contingency-Aware Exploration in Reinforcement Learning
- 6.00, 0.82, [7, 5, 6] [5, 4, 5]Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
- 6.00, 0.00, [6, 6, 6] [2, 2, 4]Proxy-less Architecture Search via Binarized Path Learning
- 6.00, 1.41, [4, 7, 7] [4, 4, 4]Revealing interpretable object representations from human behavior
- 6.00, 0.00, [6, 6, 6] [4, 4, 5]Multi-step Reasoning for Open-domain Question Answering
- 6.00, 0.00, [6, 6, 6] [3, 3, 4]Single Shot Neural Architecture Search Via Direct Sparse Optimization
- 5.75, 0.83, [7, 5, 6, 5] [3, 3, 3, 4]On the Spectral Bias of Neural Networks
- 5.75, 0.83, [5, 7, 6, 5] [3, 4, 3, 3]Modeling Parts, Structure, and System Dynamics via Predictive Learning
- 5.75, 0.83, [5, 6, 5, 7] [4, 4, 5, 3]An Alarm System for Segmentation Algorithm Based on Shape Model
- 5.75, 0.43, [6, 5, 6, 6] [4, 4, 3, 4]Two-Timescale Networks for Nonlinear Value Function Approximation
- 5.67, 0.94, [7, 5, 5] [5, 5, 4](Unconstrained) Beam Search is Sensitive to Large Search Discrepancies
- 5.67, 1.25, [7, 6, 4] [4, 1, 4]CONTROLLING COVARIATE SHIFT USING EQUILIBRIUM NORMALIZATION OF WEIGHTS
- 5.67, 0.47, [5, 6, 6] [4, 3, 3]Amortized Context Vector Inference for Sequence-to-Sequence Networks
- 5.67, 0.94, [5, 5, 7] [4, 5, 4]The meaning of “most” for visual question answering models
- 5.67, 2.05, [8, 3, 6] [4, 2, 3]Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm
- 5.67, 0.94, [5, 7, 5] [4, 4, 3]A unified theory of adaptive stochastic gradient descent as Bayesian filtering
- 5.67, 0.47, [5, 6, 6] [4, 4, 4]Laplacian Smoothing Gradient Descent
- 5.67, 1.25, [4, 7, 6] [4, 4, 4]Explicit Information Placement on Latent Variables using Auxiliary Generative Modelling Task
- 5.67, 1.70, [4, 5, 8] [4, 4, 4]Discriminative Active Learning
- 5.67, 1.25, [4, 7, 6] [4, 4, 3]A Resizable Mini-batch Gradient Descent based on a Multi-Armed Bandit
- 5.67, 1.25, [4, 6, 7] [4, 4, 3]Generating Liquid Simulations with Deformation-aware Neural Networks
- 5.67, 0.94, [5, 7, 5] [4, 2, 5]A Kernel Random Matrix-Based Approach for Sparse PCA
- 5.67, 0.47, [6, 5, 6] [4, 4, 4]Identifying Generalization Properties in Neural Networks
- 5.67, 0.94, [5, 5, 7] [4, 4, 3]Hierarchical interpretations for neural network predictions
- 5.67, 0.47, [6, 5, 6] [4, 1, 3]Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
- 5.67, 0.47, [6, 5, 6] [1, 3, 4]M^3RL: Mind-aware Multi-agent Management Reinforcement Learning
- 5.67, 0.47, [6, 6, 5] [4, 4, 2]Gradient-based Training of Slow Feature Analysis by Differentiable Approximate Whitening
- 5.67, 0.47, [6, 5, 6] [3, 3, 3]Remember and Forget for Experience Replay
- 5.67, 0.94, [5, 5, 7] [4, 4, 4]Fast adversarial training for semi-supervised learning
- 5.67, 0.47, [6, 6, 5] [4, 3, 3]An Information-Theoretic Metric of Transferability for Task Transfer Learning
- 5.67, 1.25, [6, 4, 7] [4, 4, 4]Convolutional CRFs for Semantic Segmentation
- 5.67, 0.47, [6, 6, 5] [3, 5, 3]Dynamic Early Terminating of Multiply Accumulate Operations for Saving Computation Cost in Convolutional Neural Networks
- 5.67, 1.25, [4, 6, 7] [4, 4, 2]Causal importance of orientation selectivity for generalization in image recognition
- 5.67, 0.94, [5, 7, 5] [3, 3, 4]Function Space Particle Optimization for Bayesian Neural Networks
- 5.67, 0.47, [6, 5, 6] [4, 4, 4]Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds
- 5.67, 1.25, [4, 7, 6] [4, 5, 4]Visual Reasoning by Progressive Module Networks
- 5.67, 0.47, [6, 6, 5] [3, 4, 3]Incremental training of multi-generative adversarial networks
- 5.67, 0.47, [6, 5, 6] [4, 3, 4]Projective Subspace Networks For Few Shot Learning
- 5.67, 0.94, [5, 7, 5] [4, 4, 3]DANA: Scalable Out-of-the-box Distributed ASGD Without Retuning
- 5.67, 1.25, [6, 7, 4] [4, 5, 4]A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
- 5.67, 1.25, [4, 7, 6] [4, 3, 4]Adaptive Posterior Learning: few-shot learning with a surprise-based memory module
- 5.67, 0.94, [5, 7, 5] [4, 4, 4]Cramer-Wold AutoEncoder
- 5.67, 0.47, [6, 6, 5] [5, 3, 4]Better Generalization with On-the-fly Dataset Denoising
- 5.67, 1.25, [4, 7, 6] [3, 4, 4]Talk The Walk: Navigating Grids in New York City through Grounded Dialogue
- 5.67, 0.47, [5, 6, 6] [4, 4, 4]Efficient Lifelong Learning with A-GEM
- 5.67, 0.94, [5, 5, 7] [3, 3, 5]Optimal Transport Maps For Distribution Preserving Operations on Latent Spaces of Generative Models
- 5.67, 0.94, [5, 5, 7] [4, 4, 3]Learning Implicit Generative Models by Teaching Explicit Ones
- 5.67, 1.25, [4, 7, 6] [5, 4, 3]PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning
- 5.67, 2.36, [4, 9, 4] [2, 3, 4]PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation
- 5.67, 0.47, [5, 6, 6] [5, 5, 4]State-Regularized Recurrent Networks
- 5.67, 2.36, [9, 4, 4] [4, 4, 3]The Problem of Model Completion
- 5.67, 0.47, [6, 5, 6] [4, 5, 4]Zero-Resource Multilingual Model Transfer: Learning What to Share
- 5.67, 0.94, [7, 5, 5] [4, 3, 3]Learning to Make Analogies by Contrasting Abstract Relational Structure
- 5.67, 0.47, [6, 6, 5] [2, 5, 3]Towards Understanding Regularization in Batch Normalization
- 5.67, 1.25, [6, 7, 4] [4, 4, 4]ACCELERATING NONCONVEX LEARNING VIA REPLICA EXCHANGE LANGEVIN DIFFUSION
- 5.67, 0.47, [6, 6, 5] [2, 5, 4]Identifying Bias in AI using Simulation
- 5.67, 0.47, [6, 5, 6] [3, 4, 4]Understanding GANs via Generalization Analysis for Disconnected Support
- 5.67, 0.47, [6, 5, 6] [4, 3, 3]Deep Denoising: Rate-Optimal Recovery of Structured Signals with a Deep Prior
- 5.67, 1.25, [7, 4, 6] [3, 3, 4]Guiding Physical Intuition with Neural Stethoscopes
- 5.67, 0.94, [5, 7, 5] [4, 4, 5]Whitening and Coloring transform for GANs
- 5.67, 0.47, [5, 6, 6] [3, 4, 5]Efficient Codebook and Factorization for Second Order Representation Learning
- 5.67, 0.47, [6, 6, 5] [4, 3, 5]Adversarial Attacks on Node Embeddings
- 5.67, 0.47, [6, 6, 5] [4, 4, 4]Minimum Divergence vs. Maximum Margin: an Empirical Comparison on Seq2Seq Models
- 5.67, 0.47, [5, 6, 6] [3, 2, 3]Learning Neural Random Fields with Inclusive Auxiliary Generators
- 5.67, 0.47, [6, 6, 5] [4, 3, 3]Analysing Mathematical Reasoning Abilities of Neural Models
- 5.67, 0.47, [6, 5, 6] [4, 3, 4]Learning Representations of Sets through Optimized Permutations
- 5.67, 0.47, [6, 5, 6] [3, 4, 4]CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model
- 5.67, 0.94, [5, 7, 5] [5, 4, 3]Backprop with Approximate Activations for Memory-efficient Network Training
- 5.67, 0.94, [5, 5, 7] [3, 4, 4]Learning models for visual 3D localization with implicit mapping
- 5.67, 1.25, [4, 7, 6] [5, 4, 4]Estimating Information Flow in DNNs
- 5.67, 0.94, [5, 5, 7] [3, 3, 3]Adversarial Exploration Strategy for Self-Supervised Imitation Learning
- 5.67, 0.94, [7, 5, 5] [4, 5, 5]signSGD with Majority Vote is Communication Efficient and Byzantine Fault Tolerant
- 5.67, 0.94, [7, 5, 5] [3, 3, 3]Predicted Variables in Programming
- 5.67, 0.47, [5, 6, 6] [5, 4, 3]Stochastic Adversarial Video Prediction
- 5.67, 1.70, [4, 5, 8] [4, 4, 3]Cross-Entropy Loss Leads To Poor Margins
- 5.67, 0.47, [6, 6, 5] [4, 4, 1]Kernel Recurrent Learning (KeRL)
- 5.67, 1.25, [6, 4, 7] [5, 4, 4]Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model
- 5.67, 0.47, [6, 5, 6] [4, 5, 2]Overcoming Multi-model Forgetting
- 5.67, 0.94, [7, 5, 5] [4, 4, 4]ADAPTIVE NETWORK SPARSIFICATION VIA DEPENDENT VARIATIONAL BETA-BERNOULLI DROPOUT
- 5.67, 0.94, [5, 5, 7] [4, 5, 5]Domain Adaptation for Structured Output via Disentangled Patch Representations
- 5.67, 0.47, [6, 6, 5] [5, 2, 4]Large-Scale Answerer in Questioner’s Mind for Visual Dialog Question Generation
- 5.67, 1.25, [6, 4, 7] [4, 4, 3]Reliable Uncertainty Estimates in Deep Neural Networks using Noise Contrastive Priors
- 5.67, 1.25, [6, 4, 7] [2, 4, 4]Excessive Invariance Causes Adversarial Vulnerability
- 5.67, 0.47, [6, 6, 5] [4, 3, 4]Adversarial Audio Synthesis
- 5.67, 0.94, [5, 7, 5] [3, 3, 3]Spectral Inference Networks: Unifying Deep and Spectral Learning
- 5.67, 2.49, [9, 3, 5] [4, 4, 4]Unsupervised Neural Multi-Document Abstractive Summarization of Reviews
- 5.67, 1.25, [6, 4, 7] [4, 5, 4]Learning Multimodal Graph-to-Graph Translation for Molecule Optimization
- 5.67, 0.47, [6, 5, 6] [3, 4, 4]Discovery of natural language concepts in individual units
- 5.67, 0.94, [5, 5, 7] [4, 4, 4]Unsupervised Learning of Sentence Representations Using Sequence Consistency
- 5.67, 0.94, [5, 7, 5] [4, 4, 3]Improving Sequence-to-Sequence Learning via Optimal Transport
- 5.67, 1.25, [6, 4, 7] [5, 4, 3]MILE: A Multi-Level Framework for Scalable Graph Embedding
- 5.67, 1.25, [6, 4, 7] [4, 3, 3]Learning to Represent Edits
- 5.67, 0.47, [6, 6, 5] [4, 3, 3]Out-of-Sample Extrapolation with Neuron Editing
- 5.67, 0.94, [5, 5, 7] [4, 5, 4]Improving Sentence Representations with Multi-view Frameworks
- 5.67, 0.47, [6, 5, 6] [4, 3, 5]Generalizable Adversarial Training via Spectral Normalization
- 5.67, 1.89, [3, 7, 7] [4, 4, 3]Learning Entropic Wasserstein Embeddings
- 5.67, 0.47, [5, 6, 6] [2, 1, 3]Emerging Disentanglement in Auto-Encoder Based Unsupervised Image Content Transfer
- 5.67, 0.47, [5, 6, 6] [5, 4, 4]Seq2Slate: Re-ranking and Slate Optimization with RNNs
- 5.67, 0.47, [5, 6, 6] [4, 4, 3]Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
- 5.67, 0.94, [7, 5, 5] [5, 3, 3]A new dog learns old tricks: RL finds classic optimization algorithms
- 5.67, 1.25, [4, 6, 7] [3, 3, 3]Variational Autoencoder with Arbitrary Conditioning
- 5.67, 0.47, [5, 6, 6] [5, 4, 4]Neural Program Repair by Jointly Learning to Localize and Repair
- 5.67, 0.94, [7, 5, 5] [4, 4, 4]Shallow Learning For Deep Networks
- 5.67, 1.25, [4, 7, 6] [4, 4, 2]Alignment Based Matching Networks for One-Shot Classification and Open-Set Recognition
- 5.67, 0.47, [6, 5, 6] [5, 4, 5]Deep Probabilistic Video Compression
- 5.67, 0.47, [6, 6, 5] [3, 4, 5]A More Globally Accurate Dimensionality Reduction Method Using Triplets
- 5.67, 1.25, [6, 4, 7] [4, 5, 4]Adaptive Gradient Methods with Dynamic Bound of Learning Rate
- 5.67, 0.47, [6, 5, 6] [2, 4, 1]Adversarially Learned Mixture Model
- 5.67, 1.25, [4, 7, 6] [4, 2, 2]Clean-Label Backdoor Attacks
- 5.67, 1.25, [7, 4, 6] [2, 4, 4]Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes
- 5.67, 0.47, [5, 6, 6] [4, 3, 4]Trace-back along capsules and its application on semantic segmentation
- 5.67, 1.25, [7, 4, 6] [4, 4, 5]Hallucinations in Neural Machine Translation
- 5.67, 0.47, [5, 6, 6] [3, 1, 5]Learning Programmatically Structured Representations with Perceptor Gradients
- 5.67, 1.89, [7, 7, 3] [4, 5, 5]Learning Exploration Policies for Navigation
- 5.67, 0.94, [7, 5, 5] [3, 3, 4]Attentive Task-Agnostic Meta-Learning for Few-Shot Text Classification
- 5.67, 0.94, [5, 5, 7] [4, 3, 4]Open-Ended Content-Style Recombination Via Leakage Filtering
- 5.67, 2.36, [9, 4, 4] [4, 4, 4]Bayesian Modelling and Monte Carlo Inference for GAN
- 5.67, 0.47, [6, 5, 6] [4, 3, 3]Multi-objective training of Generative Adversarial Networks with multiple discriminators
- 5.67, 1.25, [4, 7, 6] [4, 3, 3]Knowledge Representation for Reinforcement Learning using General Value Functions
- 5.67, 0.47, [6, 6, 5] [5, 5, 3]Super-Resolution via Conditional Implicit Maximum Likelihood Estimation
- 5.67, 1.25, [7, 6, 4] [4, 4, 4]CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication
- 5.67, 1.25, [7, 4, 6] [4, 3, 5]NECST: Neural Joint Source-Channel Coding
- 5.67, 0.94, [7, 5, 5] [4, 3, 4]Nested Dithered Quantization for Communication Reduction in Distributed Training
- 5.67, 0.94, [5, 7, 5] [5, 3, 4]Explaining Image Classifiers by Counterfactual Generation
- 5.67, 1.25, [6, 7, 4] [5, 4, 4]The Expressive Power of Deep Neural Networks with Circulant Matrices
- 5.67, 0.47, [6, 5, 6] [3, 4, 4]Learning what you can do before doing anything
- 5.67, 0.47, [6, 6, 5] [4, 4, 4]Language Model Pre-training for Hierarchical Document Representations
- 5.67, 1.25, [6, 7, 4] [3, 4, 4]Efficient Augmentation via Data Subsampling
- 5.67, 0.47, [5, 6, 6] [4, 3, 4]Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces
- 5.67, 0.94, [7, 5, 5] [4, 4, 4]Hierarchically-Structured Variational Autoencoders for Long Text Generation
- 5.67, 0.94, [5, 5, 7] [3, 4, 4]Where Off-Policy Deep Reinforcement Learning Fails
- 5.67, 1.25, [4, 7, 6] [4, 4, 4]TENSOR RING NETS ADAPTED DEEP MULTI-TASK LEARNING
- 5.67, 0.47, [6, 5, 6] [3, 5, 4]A Variational Dirichlet Framework for Out-of-Distribution Detection
- 5.67, 0.94, [5, 7, 5] [4, 3, 4]Adaptive Sample-space & Adaptive Probability coding: a neural-network based approach for compression
- 5.67, 1.70, [5, 8, 4] [4, 3, 4]Augment your batch: better training with larger batches
- 5.67, 0.94, [5, 7, 5] [4, 2, 5]On Difficulties of Probability Distillation
- 5.67, 0.47, [5, 6, 6] [2, 4, 3]Top-Down Neural Model For Formulae
- 5.67, 0.94, [5, 5, 7] [4, 4, 4]Manifold regularization with GANs for semi-supervised learning
- 5.67, 0.94, [5, 5, 7] [5, 3, 4]Cross-Task Knowledge Transfer for Visually-Grounded Navigation
- 5.67, 1.25, [6, 7, 4] [3, 2, 4]Rotation Equivariant Networks via Conic Convolution and the DFT
- 5.67, 1.89, [3, 7, 7] [5, 4, 5]Small steps and giant leaps: Minimal Newton solvers for Deep Learning
- 5.67, 1.25, [7, 4, 6] [4, 3, 4]Beyond Greedy Ranking: Slate Optimization via List-CVAE
- 5.67, 0.47, [5, 6, 6] [3, 4, 3]Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
- 5.67, 0.47, [5, 6, 6] [4, 4, 4]Learning to Augment Influential Data
- 5.67, 1.25, [7, 4, 6] [3, 3, 3]Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference
- 5.67, 1.70, [8, 5, 4] [3, 3, 4]Cost-Sensitive Robustness against Adversarial Examples
- 5.67, 0.47, [6, 6, 5] [4, 1, 4]Learning to Design RNA
- 5.67, 1.25, [7, 4, 6] [3, 4, 3]Learning Procedural Abstractions and Evaluating Discrete Latent Temporal Structure
- 5.67, 0.94, [5, 5, 7] [3, 3, 3]Finite Automata Can be Linearly Decoded from Language-Recognizing RNNs
- 5.67, 0.47, [5, 6, 6] [5, 4, 4]Selfless Sequential Learning
- 5.67, 0.47, [6, 6, 5] [4, 4, 4]Modeling the Long Term Future in Model-Based Reinforcement Learning
- 5.67, 1.25, [7, 4, 6] [3, 4, 4]Poincaré GloVe: Hyperbolic Word Embeddings
- 5.67, 0.47, [6, 5, 6] [5, 5, 4]Rethinking the Value of Network Pruning
- 5.67, 0.94, [5, 5, 7] [2, 4, 4]DL2: Training and Querying Neural Networks with Logic
- 5.50, 1.12, [7, 5, 4, 6] [4, 4, 4, 4]Computing committor functions for the study of rare events using deep learning with importance sampling
- 5.50, 0.50, [5, 6, 6, 5] [4, 4, 3, 4]Interactive Agent Modeling by Learning to Probe
- 5.50, 0.87, [6, 6, 6, 4] [2, 2, 3, 4]Multi-way Encoding for Robustness to Adversarial Attacks
- 5.50, 0.87, [7, 5, 5, 5] [3, 3, 4, 4]On the Margin Theory of Feedforward Neural Networks
- 5.50, 0.87, [6, 6, 6, 4] [2, 2, 4, 5]CAML: Fast Context Adaptation via Meta-Learning
- 5.50, 0.50, [5, 6] [3, 2]Policy Optimization via Stochastic Recursive Gradient Algorithm
- 5.33, 0.47, [6, 5, 5] [3, 4, 3]The Universal Approximation Power of Finite-Width Deep ReLU Networks
- 5.33, 0.47, [5, 6, 5] [3, 3, 5]Classification from Positive, Unlabeled and Biased Negative Data
- 5.33, 1.25, [4, 7, 5] [3, 4, 3]Convolutional Neural Networks on Non-uniform Geometrical Signals Using Euclidean Spectral Transformation
- 5.33, 0.47, [5, 6, 5] [3, 4, 4]Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets
- 5.33, 0.94, [6, 6, 4] [4, 2, 3]Lipschitz regularized Deep Neural Networks converge and generalize
- 5.33, 0.47, [5, 5, 6] [3, 4, 1]Playing the Game of Universal Adversarial Perturbations
- 5.33, 0.94, [4, 6, 6] [4, 3, 3]Provable Guarantees on Learning Hierarchical Generative Models with Deep CNNs
- 5.33, 2.49, [6, 8, 2] [4, 4, 4]Caveats for information bottleneck in deterministic scenarios
- 5.33, 1.25, [5, 4, 7] [4, 4, 4]Clinical Risk: wavelet reconstruction networks for marked point processes
- 5.33, 1.70, [7, 6, 3] [3, 4, 2]The relativistic discriminator: a key element missing from standard GAN
- 5.33, 0.47, [5, 6, 5] [4, 3, 5]On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
- 5.33, 0.47, [5, 5, 6] [4, 2, 4]Adaptive Pruning of Neural Language Models for Mobile Devices
- 5.33, 0.94, [6, 4, 6] [3, 4, 4]Reducing Overconfident Errors outside the Known Distribution
- 5.33, 0.94, [6, 4, 6] [4, 5, 4]Learning to Understand Goal Specifications by Modelling Reward
- 5.33, 0.94, [6, 4, 6] [3, 4, 5]LEARNING FACTORIZED REPRESENTATIONS FOR OPEN-SET DOMAIN ADAPTATION
- 5.33, 0.47, [5, 5, 6] [4, 4, 4]SOSELETO: A Unified Approach to Transfer Learning and Training with Noisy Labels
- 5.33, 0.47, [5, 5, 6] [2, 4, 3]An experimental study of layer-level training speed and its impact on generalization
- 5.33, 0.47, [6, 5, 5] [4, 4, 3]Perfect Match: A Simple Method for Learning Representations For Counterfactual Inference With Neural Networks
- 5.33, 1.89, [4, 4, 8] [4, 4, 3]DecayNet: A Study on the Cell States of Long Short Term Memories
- 5.33, 0.47, [5, 5, 6] [3, 4, 3]Training generative latent models by variational f-divergence minimization
- 5.33, 1.25, [5, 7, 4] [4, 5, 5]Domain Generalization via Invariant Representation under Domain-Class Dependency
- 5.33, 1.25, [5, 7, 4] [4, 3, 4]Distribution-Interpolation Trade off in Generative Models
- 5.33, 0.47, [5, 6, 5] [5, 2, 3]Purchase as Reward: Session-based Recommendation by Imagination Reconstruction
- 5.33, 0.47, [6, 5, 5] [3, 3, 4]Learning to Separate Domains in Generalized Zero-Shot and Open Set Learning: a probabilistic perspective
- 5.33, 0.94, [6, 6, 4] [3, 3, 2]Exploring and Enhancing the Transferability of Adversarial Examples
- 5.33, 1.25, [7, 4, 5] [5, 4, 3]Switching Linear Dynamics for Variational Bayes Filtering
- 5.33, 1.25, [5, 7, 4] [4, 3, 4]The loss landscape of overparameterized neural networks
- 5.33, 0.94, [6, 4, 6] [4, 4, 3]Curiosity-Driven Experience Prioritization via Density Estimation
- 5.33, 0.47, [5, 5, 6] [5, 5, 3]Generalization and Regularization in DQN
- 5.33, 1.25, [4, 5, 7] [3, 5, 2]Invariant-equivariant representation learning for multi-class data
- 5.33, 2.62, [9, 3, 4] [4, 5, 4]Large-Scale Visual Speech Recognition
- 5.33, 0.47, [5, 5, 6] [4, 4, 4]RoC-GAN: Robust Conditional GAN
- 5.33, 1.25, [7, 5, 4] [2, 2, 2]On the Turing Completeness of Modern Neural Network Architectures
- 5.33, 0.47, [5, 6, 5] [4, 4, 2]The Unusual Effectiveness of Averaging in GAN Training
- 5.33, 0.94, [6, 6, 4] [4, 5, 4]Graph Wavelet Neural Network
- 5.33, 0.47, [6, 5, 5] [4, 4, 5]Gaussian-gated LSTM: Improved convergence by reducing state updates
- 5.33, 0.94, [6, 4, 6] [3, 3, 3]Meta Domain Adaptation: Meta-Learning for Few-Shot Learning under Domain Shift
- 5.33, 0.47, [5, 5, 6] [4, 4, 5]Learning to encode spatial relations from natural language
- 5.33, 0.47, [6, 5, 5] [3, 3, 3]Skip-gram word embeddings in hyperbolic space
- 5.33, 0.47, [6, 5, 5] [4, 4, 4]Graph Matching Networks for Learning the Similarity of Graph Structured Objects
- 5.33, 1.25, [5, 7, 4] [4, 4, 3]Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation
- 5.33, 1.70, [7, 3, 6] [4, 5, 4]Learning From the Experience of Others: Approximate Empirical Bayes in Neural Networks
- 5.33, 2.05, [8, 3, 5] [4, 5, 4]DiffraNet: Automatic Classification of Serial Crystallography Diffraction Patterns
- 5.33, 0.47, [6, 5, 5] [4, 3, 3]Improving Composition of Sentence Embeddings through the Lens of Statistical Relational Learning
- 5.33, 0.94, [6, 6, 4] [4, 3, 4]Learning to Generate Parameters from Natural Languages for Graph Neural Networks
- 5.33, 1.25, [7, 5, 4] [5, 3, 4]Adaptive Neural Trees
- 5.33, 0.47, [6, 5, 5] [2, 3, 3]Learning Internal Dense But External Sparse Structures of Deep Neural Network
- 5.33, 1.25, [5, 7, 4] [4, 3, 2]DOMAIN ADAPTATION VIA DISTRIBUTION AND REPRESENTATION MATCHING: A CASE STUDY ON TRAINING DATA SELECTION VIA REINFORCEMENT LEARNING
- 5.33, 1.25, [4, 5, 7] [5, 4, 4]Unseen Action Recognition with Multimodal Learning
- 5.33, 0.47, [5, 5, 6] [3, 4, 3]Equi-normalization of Neural Networks
- 5.33, 0.47, [5, 5, 6] [5, 4, 2]Adversarial Sampling for Active Learning
- 5.33, 1.70, [6, 7, 3] [5, 4, 3]CEM-RL: Combining evolutionary and gradient-based methods for policy search
- 5.33, 1.25, [7, 5, 4] [4, 3, 4]Overcoming Catastrophic Forgetting via Model Adaptation
- 5.33, 0.47, [6, 5, 5] [3, 4, 4]Hierarchically Clustered Representation Learning
- 5.33, 1.89, [8, 4, 4] [4, 4, 5]Neural Causal Discovery with Learnable Input Noise
- 5.33, 0.47, [5, 6, 5] [5, 3, 4]h-detach: Modifying the LSTM Gradient Towards Better Optimization
- 5.33, 0.94, [4, 6, 6] [3, 4, 4]Structured Neural Summarization
- 5.33, 0.94, [4, 6, 6] [3, 4, 4]Soft Q-Learning with Mutual-Information Regularization
- 5.33, 0.47, [5, 6, 5] [5, 3, 4]Set Transformer
- 5.33, 0.47, [5, 5, 6] [3, 4, 4]Learning data-derived privacy preserving representations from information metrics
- 5.33, 0.47, [6, 5, 5] [4, 4, 2]EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE
- 5.33, 0.47, [5, 6, 5] [3, 2, 3]Negotiating Team Formation Using Deep Reinforcement Learning
- 5.33, 1.25, [4, 7, 5] [4, 3, 3]Stackelberg GAN: Towards Provable Minimax Equilibrium via Multi-Generator Architectures
- 5.33, 0.47, [5, 5, 6] [4, 4, 4]Lorentzian Distance Learning
- 5.33, 1.70, [6, 7, 3] [2, 4, 2]Cohen Welling bases & SO(2)-Equivariant classifiers using Tensor nonlinearity.
- 5.33, 0.47, [5, 5, 6] [4, 5, 4]EnGAN: Latent Space MCMC and Maximum Entropy Generators for Energy-based Models
- 5.33, 0.47, [5, 6, 5] [5, 4, 4]Exploring Curvature Noise in Large-Batch Stochastic Optimization
- 5.33, 0.94, [4, 6, 6] [4, 4, 4]Transformer-XL: Language Modeling with Longer-Term Dependency
- 5.33, 0.47, [5, 5, 6] [3, 3, 3]The Case for Full-Matrix Adaptive Regularization
- 5.33, 0.94, [6, 6, 4] [4, 5, 5]BLISS in Non-Isometric Embedding Spaces
- 5.33, 1.70, [6, 3, 7] [4, 2, 4]Learning-Based Frequency Estimation Algorithms
- 5.33, 0.94, [4, 6, 6] [4, 3, 4]Hint-based Training for Non-Autoregressive Translation
- 5.33, 2.62, [9, 4, 3] [4, 4, 5]An adaptive homeostatic algorithm for the unsupervised learning of visual features
- 5.33, 1.89, [4, 8, 4] [3, 4, 4]A Deep Learning Approach for Dynamic Survival Analysis with Competing Risks
- 5.33, 0.94, [6, 6, 4] [3, 4, 4]Knowledge Distillation from Few Samples
- 5.33, 0.47, [6, 5, 5] [4, 4, 2]Measuring and regularizing networks in function space
- 5.33, 0.47, [5, 6, 5] [4, 3, 4]Graph Classification with Geometric Scattering
- 5.33, 0.47, [5, 5, 6] [3, 3, 2]Selective Convolutional Units: Improving CNNs via Channel Selectivity
- 5.33, 0.47, [5, 5, 6] [4, 3, 5]Learning to Decompose Compound Questions with Reinforcement Learning
- 5.33, 0.47, [5, 6, 5] [4, 2, 4]Infinitely Deep Infinite-Width Networks
- 5.33, 0.47, [5, 5, 6] [4, 3, 4]State-Denoised Recurrent Neural Networks
- 5.33, 0.47, [5, 6, 5] [4, 4, 4]Scalable Unbalanced Optimal Transport using Generative Adversarial Networks
- 5.33, 0.47, [5, 6, 5] [4, 5, 4]CDeepEx: Contrastive Deep Explanations
- 5.33, 1.25, [4, 5, 7] [3, 3, 5]Verification of Non-Linear Specifications for Neural Networks
- 5.33, 1.25, [7, 5, 4] [4, 4, 5]LEARNING GENERATIVE MODELS FOR DEMIXING OF STRUCTURED SIGNALS FROM THEIR SUPERPOSITION USING GANS
- 5.33, 0.47, [5, 6, 5] [4, 3, 3]Learning State Representations in Complex Systems with Multimodal Data
- 5.33, 1.70, [3, 7, 6] [3, 3, 3]Transfer and Exploration via the Information Bottleneck
- 5.33, 0.47, [6, 5, 5] [4, 3, 3]Unsupervised Conditional Generation using noise engineered mode matching GAN
- 5.33, 0.94, [6, 4, 6] [4, 3, 3]Learning to Describe Scenes with Programs
- 5.33, 2.05, [8, 5, 3] [4, 3, 4]Human-level Protein Localization with Convolutional Neural Networks
- 5.33, 1.70, [3, 7, 6] [5, 5, 3]Improved Language Modeling by Decoding the Past
- 5.33, 0.47, [5, 5, 6] [3, 3, 4]Amortized Bayesian Meta-Learning
- 5.33, 1.25, [7, 4, 5] [4, 5, 4]Coverage and Quality Driven Training of Generative Image Models
- 5.33, 0.47, [5, 5, 6] [3, 5, 3]Learning space time dynamics with PDE guided neural networks
- 5.33, 0.94, [6, 4, 6] [3, 3, 3]NLProlog: Reasoning with Weak Unification for Natural Language Question Answering
- 5.33, 0.94, [4, 6, 6] [4, 3, 3]Actor-Attention-Critic for Multi-Agent Reinforcement Learning
- 5.33, 1.25, [4, 5, 7] [4, 3, 4]Deep learning generalizes because the parameter-function map is biased towards simple functions
- 5.33, 1.25, [7, 4, 5] [4, 3, 4]Learning protein sequence embeddings using information from structure
- 5.33, 0.47, [5, 6, 5] [4, 3, 3]Meta Learning with Fast/Slow Learners
- 5.33, 1.70, [3, 6, 7] [4, 4, 4]Meta-Learning for Contextual Bandit Exploration
- 5.33, 1.25, [4, 7, 5] [3, 4, 5]Understanding & Generalizing AlphaGo Zero
- 5.33, 1.25, [4, 7, 5] [3, 4, 4]Random mesh projectors for inverse problems
- 5.33, 1.89, [8, 4, 4] [4, 4, 5]Deep Anomaly Detection with Outlier Exposure
- 5.33, 0.47, [5, 6, 5] [4, 4, 4]Probabilistic Model-Based Dynamic Architecture Search
- 5.33, 0.47, [6, 5, 5] [4, 2, 3]Mimicking actions is a good strategy for beginners: Fast Reinforcement Learning with Expert Action Sequences
- 5.33, 1.25, [5, 7, 4] [5, 5, 5]Universal Successor Features for Transfer Reinforcement Learning
- 5.33, 1.25, [7, 5, 4] [4, 4, 4]Combining Neural Networks with Personalized PageRank for Classification on Graphs
- 5.33, 1.25, [5, 4, 7] [4, 5, 4]AIM: Adversarial Inference by Matching Priors and Conditionals
- 5.33, 1.25, [4, 7, 5] [3, 4, 4]DON’T JUDGE A BOOK BY ITS COVER – ON THE DYNAMICS OF RECURRENT NEURAL NETWORKS
- 5.33, 1.25, [4, 7, 5] [3, 4, 4]The Nonlinearity Coefficient – Predicting Generalization in Deep Neural Networks
- 5.33, 1.25, [7, 4, 5] [3, 4, 4]Multi-task Learning with Gradient Communication
- 5.33, 1.25, [5, 4, 7] [3, 3, 4]Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
- 5.33, 1.25, [4, 7, 5] [4, 4, 4]DHER: Hindsight Experience Replay for Dynamic Goals
- 5.33, 1.25, [5, 7, 4] [3, 4, 4]I Know the Feeling: Learning to Converse with Empathy
- 5.33, 0.47, [6, 5, 5] [3, 4, 4]Towards Decomposed Linguistic Representation with Holographic Reduced Representation
- 5.33, 2.05, [5, 3, 8] [4, 5, 4]Heated-Up Softmax Embedding
- 5.33, 1.89, [8, 4, 4] [2, 4, 4]Advocacy Learning
- 5.33, 1.25, [4, 7, 5] [4, 4, 3]A Modern Take on the Bias-Variance Tradeoff in Neural Networks
- 5.33, 0.47, [6, 5, 5] [2, 4, 3]Surprising Negative Results for Generative Adversarial Tree Search
- 5.33, 0.47, [5, 5, 6] [5, 3, 5]Exploring the interpretability of LSTM neural networks over multi-variable data
- 5.33, 0.94, [6, 6, 4] [4, 4, 3]Probabilistic Federated Neural Matching
- 5.33, 0.47, [5, 5, 6] [3, 3, 4]Importance Resampling for Off-policy Policy Evaluation
- 5.33, 0.47, [6, 5, 5] [1, 3, 5]Deep Imitative Models for Flexible Inference, Planning, and Control
- 5.33, 0.47, [6, 5, 5] [4, 4, 3]Complementary-label learning for arbitrary losses and models
- 5.33, 0.47, [5, 5, 6] [4, 4, 4]Online Hyperparameter Adaptation via Amortized Proximal Optimization
- 5.33, 0.47, [6, 5, 5] [4, 4, 2]DEEP GRAPH TRANSLATION
- 5.33, 0.94, [6, 6, 4] [3, 4, 5]Adapting Auxiliary Losses Using Gradient Similarity
- 5.33, 3.09, [7, 8, 1] [4, 4, 3]Optimal Control Via Neural Networks: A Convex Approach
- 5.33, 1.25, [5, 7, 4] [4, 3, 3]Composing Entropic Policies using Divergence Correction
- 5.33, 1.25, [5, 7, 4] [3, 4, 3]Neural Predictive Belief Representations
- 5.33, 0.47, [5, 5, 6] [3, 4, 4]Learning Backpropagation-Free Deep Architectures with Kernels
- 5.33, 0.94, [4, 6, 6] [4, 4, 4]Can I trust you more? Model-Agnostic Hierarchical Explanations
- 5.33, 0.47, [5, 6, 5] [3, 4, 5]Open Loop Hyperparameter Optimization and Determinantal Point Processes
- 5.33, 1.25, [4, 5, 7] [4, 4, 3]Sorting out Lipschitz function approximation
- 5.33, 0.47, [5, 5, 6] [5, 3, 4]Knows When it Doesn’t Know: Deep Abstaining Classifiers
- 5.33, 0.47, [5, 6, 5] [3, 3, 2]Probabilistic Knowledge Graph Embeddings
- 5.33, 0.47, [5, 6, 5] [4, 3, 3]An Active Learning Framework for Efficient Robust Policy Search
- 5.33, 0.47, [5, 5, 6] [4, 2, 2]Tree-Structured Recurrent Switching Linear Dynamical Systems for Multi-Scale Modeling
- 5.33, 0.47, [6, 5, 5] [2, 4, 3]Uncovering Surprising Behaviors in Reinforcement Learning via Worst-case Analysis
- 5.33, 0.47, [5, 6, 5] [5, 3, 4]Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design
- 5.33, 0.94, [4, 6, 6] [4, 3, 4]Meta-Learning Language-Guided Policy Learning
- 5.33, 0.47, [5, 6, 5] [5, 3, 4]Neural Model-Based Reinforcement Learning for Recommendation
- 5.33, 0.47, [5, 6, 5] [3, 3, 3]MahiNet: A Neural Network for Many-Class Few-Shot Learning with Class Hierarchy
- 5.33, 0.94, [4, 6, 6] [4, 3, 4]IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN
- 5.33, 0.47, [5, 5, 6] [4, 2, 5]AntMan: Sparse Low-Rank Compression To Accelerate RNN Inference
- 5.33, 0.94, [4, 6, 6] [4, 2, 4]Multi-Agent Dual Learning
- 5.33, 1.25, [4, 7, 5] [4, 4, 4]Search-Guided, Lightly-supervised Training of Structured Prediction Energy Networks
- 5.33, 0.47, [6, 5, 5] [3, 4, 3]Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
- 5.33, 0.94, [4, 6, 6] [5, 3, 3]Simple Black-box Adversarial Attacks
- 5.33, 0.47, [5, 5, 6] [4, 4, 4]Interpolation-Prediction Networks for Irregularly Sampled Time Series
- 5.33, 1.25, [4, 7, 5] [4, 5, 4]SynonymNet: Multi-context Bilateral Matching for Entity Synonyms
- 5.33, 1.25, [4, 5, 7] [2, 5, 3]Synthetic Datasets for Neural Program Synthesis
- 5.33, 0.94, [4, 6, 6] [4, 3, 3]Generative Adversarial Networks for Extreme Learned Image Compression
- 5.33, 0.47, [5, 6, 5] [4, 4, 5]Local Binary Pattern Networks for Character Recognition
- 5.25, 1.30, [7, 4, 6, 4] [2, 4, 3, 4]Unified recurrent network for many feature types
- 5.25, 0.43, [5, 5, 6, 5] [5, 5, 5, 4]Sample Efficient Imitation Learning for Continuous Control
- 5.25, 1.09, [4, 7, 5, 5] [5, 3, 4, 3]Improving Generative Adversarial Imitation Learning with Non-expert Demonstrations
- 5.25, 0.83, [6, 5, 6, 4] [3, 3, 3, 4]Generative Feature Matching Networks
- 5.25, 0.83, [6, 5, 6, 4] [4, 4, 4, 3]Convergent Reinforcement Learning with Function Approximation: A Bilevel Optimization Perspective
- 5.25, 0.83, [4, 5, 6, 6] [4, 4, 2, 4]Optimistic Acceleration for Optimization
- 5.25, 1.09, [5, 5, 7, 4] [5, 3, 4, 4]P^2IR: Universal Deep Node Representation via Partial Permutation Invariant Set Functions
- 5.00, 0.82, [6, 4, 5] [3, 4, 4]Towards Language Agnostic Universal Representations
- 5.00, 1.63, [3, 5, 7] [3, 4, 3]Transfer Learning for Estimating Causal Effects Using Neural Networks
- 5.00, 1.63, [7, 3, 5] [4, 5, 4]Reduced-Gate Convolutional LSTM Design Using Predictive Coding for Next-Frame Video Prediction
- 5.00, 1.41, [7, 4, 4] [3, 4, 4]Metric-Optimized Example Weights
- 5.00, 0.00, [5, 5, 5] [4, 4, 3]Quantization for Rapid Deployment of Deep Neural Networks
- 5.00, 0.00, [5, 5, 5] [4, 3, 4]Excitation Dropout: Encouraging Plasticity in Deep Neural Networks
- 5.00, 0.00, [5, 5, 5] [3, 4, 4]Convergence Properties of Deep Neural Networks on Separable Data
- 5.00, 1.41, [4, 4, 7] [4, 3, 4]k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks
- 5.00, 1.41, [4, 4, 7] [4, 3, 4]Stop memorizing: A data-dependent regularization framework for intrinsic pattern learning
- 5.00, 0.82, [5, 4, 6] [5, 4, 4]Collapse of deep and narrow neural nets
- 5.00, 0.82, [6, 5, 4] [4, 4, 2]Déjà Vu: An Empirical Evaluation of the Memorization Properties of Convnets
- 5.00, 1.41, [7, 4, 4] [4, 5, 3]Adversarial Reprogramming of Neural Networks
- 5.00, 0.82, [6, 4, 5] [4, 4, 4]Spread Divergences
- 5.00, 0.82, [5, 4, 6] [4, 4, 4]Massively Parallel Hyperparameter Tuning
- 5.00, 0.82, [4, 5, 6] [4, 3, 3]Using Ontologies To Improve Performance In Massively Multi-label Prediction
- 5.00, 0.82, [4, 6, 5] [4, 5, 4]FAVAE: SEQUENCE DISENTANGLEMENT USING INFORMATION BOTTLENECK PRINCIPLE
- 5.00, 0.82, [6, 4, 5] [4, 3, 3]Learning Neuron Non-Linearities with Kernel-Based Deep Neural Networks
- 5.00, 0.00, [5, 5, 5] [4, 5, 4]GRAPH TRANSFORMATION POLICY NETWORK FOR CHEMICAL REACTION PREDICTION
- 5.00, 1.41, [7, 4, 4] [4, 4, 3]Discrete flow posteriors for variational inference in discrete dynamical systems
- 5.00, 0.82, [4, 6, 5] [3, 3, 4]Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles
- 5.00, 0.00, [5, 5, 5] [4, 4, 3]Improving Gaussian mixture latent variable model convergence with Optimal Transport
- 5.00, 0.00, [5, 5, 5] [3, 5, 2]Volumetric Convolution: Automatic Representation Learning in Unit Ball
- 5.00, 0.82, [4, 5, 6] [3, 3, 3]Directional Analysis of Stochastic Gradient Descent via von Mises-Fisher Distributions in Deep Learning
- 5.00, 0.82, [6, 5, 4] [3, 4, 3]Convolutional Neural Networks combined with Runge-Kutta Methods
- 5.00, 0.82, [4, 5, 6] [3, 4, 4]Cumulative Saliency based Globally Balanced Filter Pruning For Efficient Convolutional Neural Networks
- 5.00, 2.16, [4, 8, 3] [5, 5, 3]Initialized Equilibrium Propagation for Backprop-Free Training
- 5.00, 2.16, [2, 6, 7] [5, 4, 4]Variational Smoothing in Recurrent Neural Network Language Models
- 5.00, 0.82, [4, 6, 5] [4, 5, 3]An Automatic Operation Batching Strategy for the Backward Propagation of Neural Networks Having Dynamic Computation Graphs
- 5.00, 0.82, [5, 6, 4] [4, 2, 3]Low Latency Privacy Preserving Inference
- 5.00, 0.82, [4, 6, 5] [4, 3, 5]Optimal margin Distribution Network
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]Pyramid Recurrent Neural Networks for Multi-Scale Change-Point Detection
- 5.00, 0.00, [5, 5, 5] [4, 4, 5]Learning Discriminators as Energy Networks in Adversarial Learning
- 5.00, 0.00, [5, 5, 5] [4, 4, 4]S3TA: A Soft, Spatial, Sequential, Top-Down Attention Model
- 5.00, 0.00, [5, 5, 5] [4, 4, 3]RedSync : Reducing Synchronization Traffic for Distributed Deep Learning
- 5.00, 1.41, [4, 4, 7] [3, 4, 4]Accelerated Value Iteration via Anderson Mixing
- 5.00, 0.82, [6, 5, 4] [4, 4, 4]On the Relationship between Neural Machine Translation and Word Alignment
- 5.00, 0.82, [5, 6, 4] [4, 4, 4]Denoise while Aggregating: Collaborative Learning in Open-Domain Question Answering
- 5.00, 0.82, [6, 5, 4] [4, 4, 5]Unicorn: Continual learning with a universal, off-policy agent
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]SnapQuant: A Probabilistic and Nested Parameterization for Binary Networks
- 5.00, 0.82, [6, 4, 5] [3, 3, 3]Spatial-Winograd Pruning Enabling Sparse Winograd Convolution
- 5.00, 1.63, [7, 3, 5] [4, 4, 4]On Accurate Evaluation of GANs for Language Generation
- 5.00, 1.41, [4, 7, 4] [5, 2, 3]Cautious Deep Learning
- 5.00, 1.41, [7, 4, 4] [5, 3, 5]A Variational Autoencoder for Probabilistic Non-Negative Matrix Factorisation
- 5.00, 0.82, [4, 6, 5] [4, 3, 4]Likelihood-based Permutation Invariant Loss Function for Probability Distributions
- 5.00, 0.82, [5, 4, 6] [4, 4, 3]The Effectiveness of Pre-Trained Code Embeddings
- 5.00, 0.82, [4, 5, 6] [4, 4, 3]Unsupervised Document Representation using Partition Word-Vectors Averaging
- 5.00, 0.00, [5, 5, 5] [4, 3, 4]Ada-Boundary: Accelerating the DNN Training via Adaptive Boundary Batch Selection
- 5.00, 0.82, [6, 4, 5] [4, 4, 4]Interactive Parallel Exploration for Reinforcement Learning in Continuous Action Spaces
- 5.00, 0.00, [5, 5, 5] [4, 3, 3]Revisiting Reweighted Wake-Sleep
- 5.00, 0.82, [4, 5, 6] [4, 4, 4]Teacher Guided Architecture Search
- 5.00, 0.82, [6, 4, 5] [4, 3, 4]What Would pi* Do?: Imitation Learning via Off-Policy Reinforcement Learning
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]Connecting the Dots Between MLE and RL for Sequence Generation
- 5.00, 0.82, [5, 4, 6] [4, 4, 2]Consistent Jumpy Predictions for Videos and Scenes
- 5.00, 0.00, [5, 5, 5] [5, 5, 4]Phrase-Based Attentions
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]On-Policy Trust Region Policy Optimisation with Replay Buffers
- 5.00, 0.00, [5, 5, 5] [3, 4, 3]Capacity of Deep Neural Networks under Parameter Quantization
- 5.00, 1.41, [4, 4, 7] [4, 4, 3]Probabilistic Semantic Embedding
- 5.00, 1.41, [4, 4, 7] [4, 3, 3]The Importance of Norm Regularization in Linear Graph Embedding: Theoretical Analysis and Empirical Demonstration
- 5.00, 0.00, [5, 5, 5] [3, 4, 3]Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
- 5.00, 0.82, [6, 5, 4] [4, 4, 3]Point Cloud GAN
- 5.00, 0.82, [5, 4, 6] [4, 4, 5]Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology
- 5.00, 0.00, [5, 5, 5] [4, 4, 4]Dataset Distillation
- 5.00, 0.82, [6, 4, 5] [2, 4, 4]Representation-Constrained Autoencoders and an Application to Wireless Positioning
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Minima and Regularization Effects
- 5.00, 0.82, [6, 4, 5] [4, 5, 5]A Case for Object Compositionality in Deep Generative Models of Images
- 5.00, 0.00, [5, 5, 5] [5, 4, 3]An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack
- 5.00, 0.82, [5, 4, 6] [4, 4, 3]COLLABORATIVE MULTIAGENT REINFORCEMENT LEARNING IN HOMOGENEOUS SWARMS
- 5.00, 0.00, [5, 5, 5] [2, 4, 4]Deep Recurrent Gaussian Process with Variational Sparse Spectrum Approximation
- 5.00, 0.00, [5, 5, 5] [4, 3, 3]Transferrable End-to-End Learning for Protein Interface Prediction
- 5.00, 1.41, [3, 6, 6] [3, 3, 1]Improved robustness to adversarial examples using Lipschitz regularization of the loss
- 5.00, 0.00, [5, 5, 5] [5, 4, 3]Dense Morphological Network: An Universal Function Approximator
- 5.00, 0.00, [5, 5, 5] [5, 2, 5]High Resolution and Fast Face Completion via Progressively Attentive GANs
- 5.00, 0.00, [5, 5, 5] [1, 3, 3]Model Comparison for Semantic Grouping
- 5.00, 0.82, [4, 6, 5] [4, 3, 4]Learning to Refer to 3D Objects with Natural Language
- 5.00, 0.82, [4, 5, 6] [4, 3, 4]Dissecting an Adversarial framework for Information Retrieval
- 5.00, 0.82, [4, 6, 5] [4, 3, 5]NETWORK COMPRESSION USING CORRELATION ANALYSIS OF LAYER RESPONSES
- 5.00, 1.41, [7, 4, 4] [4, 4, 5]On Learning Heteroscedastic Noise Models within Differentiable Bayes Filters
- 5.00, 0.82, [4, 5, 6] [4, 4, 5]Physiological Signal Embeddings (PHASE) via Interpretable Stacked Models
- 5.00, 0.00, [5, 5, 5] [5, 4, 4]A PRIVACY-PRESERVING IMAGE CLASSIFICATION FRAMEWORK WITH A LEARNABLE OBFUSCATOR
- 5.00, 0.00, [5, 5, 5] [4, 4, 4]Learning with Random Learning Rates.
- 5.00, 0.82, [4, 6, 5] [5, 4, 5]Canonical Correlation Analysis with Implicit Distributions
- 5.00, 1.63, [3, 5, 7] [3, 4, 5]Guided Exploration in Deep Reinforcement Learning
- 5.00, 1.41, [7, 4, 4] [4, 2, 3]The GAN Landscape: Losses, Architectures, Regularization, and Normalization
- 5.00, 1.63, [5, 3, 7] [3, 4, 3]TherML: The Thermodynamics of Machine Learning
- 5.00, 0.82, [4, 5, 6] [4, 5, 3]Graph2Seq: Scalable Learning Dynamics for Graphs
- 5.00, 0.82, [5, 6, 4] [5, 4, 4]ChoiceNet: Robust Learning by Revealing Output Correlations
- 5.00, 1.41, [7, 4, 4] [5, 4, 4]N-Ary Quantization for CNN Model Compression and Inference Acceleration
- 5.00, 0.82, [5, 6, 4] [2, 4, 2]Automata Guided Skill Composition
- 5.00, 0.82, [5, 6, 4] [5, 2, 5]Learning To Plan
- 5.00, 2.16, [6, 7, 2] [3, 4, 3]Implicit Autoencoders
- 5.00, 0.82, [4, 6, 5] [5, 4, 4]COCO-GAN: Conditional Coordinate Generative Adversarial Network
- 5.00, 0.82, [6, 4, 5] [5, 2, 4]Bayesian Deep Learning via Stochastic Gradient MCMC with a Stochastic Approximation Adaptation
- 5.00, 1.41, [7, 4, 4] [3, 4, 3]Generative Ensembles for Robust Anomaly Detection
- 5.00, 0.00, [5, 5, 5] [3, 3, 5]Characterizing Malicious Edges targeting on Graph Neural Networks
- 5.00, 0.82, [5, 6, 4] [4, 3, 5]Zero-shot Dual Machine Translation
- 5.00, 0.00, [5, 5, 5] [4, 3, 4]Inferring Reward Functions from Demonstrators with Unknown Biases
- 5.00, 0.00, [5, 5, 5] [3, 4, 4]A comprehensive, application-oriented study of catastrophic forgetting in DNNs
- 5.00, 0.82, [5, 6, 4] [4, 4, 5]Deep Reinforcement Learning of Universal Policies with Diverse Environment Summaries
- 5.00, 2.16, [2, 7, 6] [4, 3, 3]RANDOM MASK: Towards Robust Convolutional Neural Networks
- 5.00, 0.00, [5, 5, 5] [4, 5, 5]Bias Also Matters: Bias Attribution for Deep Neural Network Explanation
- 5.00, 0.82, [6, 4, 5] [4, 4, 2]Label Propagation Networks
- 5.00, 1.63, [5, 3, 7] [3, 4, 2]Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations
- 5.00, 0.82, [6, 4, 5] [4, 5, 5]Learning Global Additive Explanations for Neural Nets Using Model Distillation
- 5.00, 0.82, [6, 5, 4] [3, 3, 4]Understand the dynamics of GANs via Primal-Dual Optimization
- 5.00, 0.82, [6, 4, 5] [4, 4, 4]Rethinking learning rate schedules for stochastic optimization
- 5.00, 1.41, [4, 7, 4] [4, 3, 4]Learning and Planning with a Semantic Model
- 5.00, 0.00, [5, 5, 5] [3, 4, 3]Metropolis-Hastings view on variational inference and adversarial training
- 5.00, 0.82, [6, 5, 4] [4, 5, 4]Learning To Simulate
- 5.00, 1.41, [3, 6, 6] [5, 4, 4]Graph2Seq: Graph to Sequence Learning with Attention-Based Neural Networks
- 5.00, 0.82, [4, 6, 5] [3, 3, 4]Information Regularized Neural Networks
- 5.00, 0.82, [5, 4, 6] [4, 4, 3]Transfer Learning for Sequences via Learning to Collocate
- 5.00, 0.82, [6, 4, 5] [4, 3, 3]Guided Evolutionary Strategies: Escaping the curse of dimensionality in random search
- 5.00, 0.82, [5, 6, 4] [5, 4, 3]Quality Evaluation of GANs Using Cross Local Intrinsic Dimensionality
- 5.00, 0.82, [4, 6, 5] [4, 4, 4]Learning Actionable Representations with Goal Conditioned Policies
- 5.00, 1.41, [7, 4, 4] [3, 4, 2]Shrinkage-based Bias-Variance Trade-off for Deep Reinforcement Learning
- 5.00, 1.41, [4, 4, 7] [4, 4, 4]A RECURRENT NEURAL CASCADE-BASED MODEL FOR CONTINUOUS-TIME DIFFUSION PROCESS
- 5.00, 0.00, [5, 5, 5] [4, 4, 4]ON THE EFFECTIVENESS OF TASK GRANULARITY FOR TRANSFER LEARNING
- 5.00, 1.41, [4, 4, 7] [5, 3, 3]NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK
- 5.00, 0.82, [5, 6, 4] [4, 4, 5]Dynamic Graph Representation Learning via Self-Attention Networks
- 5.00, 0.00, [5, 5, 5] [4, 4, 3]Inducing Cooperation via Learning to reshape rewards in semi-cooperative multi-agent reinforcement learning
- 5.00, 0.00, [5, 5, 5] [5, 5, 4]VHEGAN: Variational Hetero-Encoder Randomized GAN for Zero-Short Learning
- 5.00, 1.63, [3, 5, 7] [4, 3, 2]Noisy Information Bottlenecks for Generalization
- 5.00, 0.00, [5, 5, 5] [4, 5, 4]Learning Diverse Generations using Determinantal Point Processes
- 5.00, 0.00, [5, 5, 5] [4, 3, 4]Learning Representations of Categorical Feature Combinations via Self-Attention
- 5.00, 0.82, [4, 6, 5] [4, 4, 5]MLPrune: Multi-Layer Pruning for Automated Neural Network Compression
- 5.00, 1.41, [4, 4, 7] [4, 4, 4]Zero-shot Learning for Speech Recognition with Universal Phonetic Model
- 5.00, 0.82, [4, 5, 6] [4, 5, 2]Reinforced Imitation Learning from Observations
- 5.00, 0.82, [4, 5, 6] [5, 4, 2]Link Prediction in Hypergraphs using Graph Convolutional Networks
- 5.00, 0.82, [4, 6, 5] [5, 3, 4]Structured Content Preservation for Unsupervised Text Style Transfer
- 5.00, 0.00, [5, 5, 5] [2, 3, 5]Riemannian TransE: Multi-relational Graph Embedding in Non-Euclidean Space
- 5.00, 0.82, [6, 4, 5] [2, 4, 3]On Regularization and Robustness of Deep Neural Networks
- 5.00, 0.82, [6, 5, 4] [3, 3, 3]Scalable Neural Theorem Proving on Knowledge Bases and Natural Language
- 5.00, 2.16, [8, 3, 4] [5, 5, 5]Learning to remember: Dynamic Generative Memory for Continual Learning
- 5.00, 0.82, [6, 4, 5] [4, 4, 4]A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks
- 5.00, 0.82, [5, 4, 6] [5, 4, 3]Human-Guided Column Networks: Augmenting Deep Learning with Advice
- 5.00, 0.82, [4, 6, 5] [5, 2, 4]Double Neural Counterfactual Regret Minimization
- 5.00, 0.82, [4, 5, 6] [3, 3, 4]Transferring SLU Models in Novel Domains
- 5.00, 2.16, [3, 4, 8] [5, 5, 4]Analysis of Memory Organization for Dynamic Neural Networks
- 5.00, 0.82, [6, 5, 4] [5, 3, 4]Systematic Generalization: What Is Required and Can It Be Learned?
- 5.00, 0.82, [5, 6, 4] [4, 4, 4]Context Mover’s Distance & Barycenters: Optimal transport of contexts for building representations
- 5.00, 1.22, [6, 5, 6, 3] [3, 3, 3, 5]Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
- 5.00, 0.82, [4, 6, 5] [5, 4, 4]Successor Options : An Option Discovery Algorithm for Reinforcement Learning
- 5.00, 0.82, [4, 5, 6] [4, 3, 5]STCN: Stochastic Temporal Convolutional Networks
- 5.00, 0.82, [6, 4, 5] [4, 5, 4]Analyzing Federated Learning through an Adversarial Lens
- 5.00, 1.22, [7, 4, 4, 5] [4, 4, 3, 4]Causal Reasoning from Meta-learning
- 5.00, 0.82, [6, 4, 5] [4, 3, 4]AD-VAT: An Asymmetric Dueling mechanism for learning Visual Active Tracking
- 5.00, 0.00, [5, 5, 5] [4, 3, 5]Incremental Few-Shot Learning with Attention Attractor Networks
- 5.00, 0.82, [6, 5, 4] [4, 4, 3]GenEval: A Benchmark Suite for Evaluating Generative Models
- 5.00, 0.82, [4, 5, 6] [5, 5, 3]Approximation capability of neural networks on sets of probability measures and tree-structured data
- 5.00, 0.82, [4, 6, 5] [3, 3, 4]Robustness Certification with Refinement
- 5.00, 0.82, [6, 4, 5] [3, 5, 3]Intrinsic Social Motivation via Causal Influence in Multi-Agent RL
- 5.00, 0.00, [5, 5, 5] [4, 4, 4]Making Convolutional Networks Shift-Invariant Again
- 5.00, 0.82, [6, 5, 4] [4, 4, 4]Adversarial Audio Super-Resolution with Unsupervised Feature Losses
- 5.00, 1.63, [3, 7, 5] [4, 5, 4]ACTRCE: Augmenting Experience via Teacher’s Advice
- 5.00, 0.00, [5, 5, 5] [3, 4, 3]Learnable Embedding Space for Efficient Neural Architecture Compression
- 5.00, 1.41, [4, 7, 4] [5, 4, 3]ISA-VAE: Independent Subspace Analysis with Variational Autoencoders
- 5.00, 0.82, [6, 5, 4] [3, 3, 4]Interpretable Continual Learning
- 5.00, 0.00, [5, 5, 5] [5, 4, 5]Experience replay for continual learning
- 5.00, 0.82, [6, 5, 4] [4, 3, 4]Accelerated Gradient Flow for Probability Distributions
- 5.00, 0.00, [5, 5, 5] [3, 3, 3]Learning to Progressively Plan
- 5.00, 0.82, [5, 4, 6] [4, 4, 4]Capsules Graph Neural Network
- 5.00, 0.82, [5, 4, 6] [4, 4, 5]Unsupervised Multi-Target Domain Adaptation: An Information Theoretic Approach
- 5.00, 0.82, [5, 6, 4] [3, 4, 3]Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification
- 5.00, 0.82, [5, 6, 4] [2, 4, 4]Graph2Graph Networks for Multi-Label Classification
- 5.00, 1.41, [3, 6, 6] [4, 4, 4]Towards GAN Benchmarks Which Require Generalization
- 5.00, 0.71, [5, 6, 4, 5] [5, 3, 5, 5]TTS-GAN: a generative adversarial network for style modeling in a text-to-speech system
- 5.00, 1.22, [3, 6, 5, 6] [4, 3, 4, 3]A Better Baseline for Second Order Gradient Estimation in Stochastic Computation Graphs
- 5.00, 0.82, [5, 4, 6] [5, 4, 4]Local Image-to-Image Translation via Pixel-wise Highway Adaptive Instance Normalization
- 5.00, 0.82, [4, 6, 5] [4, 4, 5]INFORMATION MAXIMIZATION AUTO-ENCODING
- 5.00, 0.82, [4, 6, 5] [4, 5, 5]Generative Adversarial Self-Imitation Learning
- 5.00, 1.41, [7, 4, 4] [3, 3, 3]Generative Adversarial Models for Learning Private and Fair Representations
- 4.80, 1.17, [6, 3, 4, 6, 5] [5, 4, 4, 2, 4]Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
- 4.75, 0.83, [6, 4, 4, 5] [3, 2, 3, 3]Cutting Down Training Memory by Re-fowarding
- 4.75, 0.83, [5, 6, 4, 4] [5, 4, 4, 4]Multi-turn Dialogue Response Generation in an Adversarial Learning Framework
- 4.75, 0.43, [5, 5, 4, 5] [2, 2, 4, 5]Pooling Is Neither Necessary nor Sufficient for Appropriate Deformation Stability in CNNs
- 4.75, 1.92, [8, 3, 4, 4] [2, 5, 5, 4]Geomstats: a Python Package for Riemannian Geometry in Machine Learning
- 4.75, 1.48, [4, 3, 7, 5] [3, 4, 4, 4]Towards a better understanding of Vector Quantized Autoencoders
- 4.67, 0.47, [4, 5, 5] [2, 3, 3]Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems
- 4.67, 1.70, [7, 3, 4] [3, 5, 4]CHEMICAL NAMES STANDARDIZATION USING NEURAL SEQUENCE TO SEQUENCE MODEL
- 4.67, 0.94, [6, 4, 4] [5, 1, 4]Traditional and Heavy Tailed Self Regularization in Neural Network Models
- 4.67, 0.47, [4, 5, 5] [3, 2, 4]Count-Based Exploration with the Successor Representation
- 4.67, 0.47, [5, 5, 4] [3, 4, 4]Learning Graph Representations by Dendrograms
- 4.67, 0.47, [5, 4, 5] [2, 3, 4]Efficient Dictionary Learning with Gradient Descent
- 4.67, 1.25, [3, 6, 5] [5, 2, 5]$A^*$ sampling with probability matching
- 4.67, 0.47, [4, 5, 5] [3, 5, 3]Neural Variational Inference For Embedding Knowledge Graphs
- 4.67, 0.94, [4, 4, 6] [4, 4, 4]SupportNet: solving catastrophic forgetting in class incremental learning with support data
- 4.67, 0.94, [4, 6, 4] [4, 5, 4]Unsupervised Image to Sequence Translation with Canvas-Drawer Networks
- 4.67, 1.70, [7, 3, 4] [4, 5, 3]Unsupervised Word Discovery with Segmental Neural Language Models
- 4.67, 1.25, [6, 3, 5] [4, 5, 4]Generative Adversarial Network Training is a Continual Learning Problem
- 4.67, 1.70, [7, 4, 3] [4, 3, 4]GENERALIZED ADAPTIVE MOMENT ESTIMATION
- 4.67, 0.47, [5, 5, 4] [5, 3, 3]Effective and Efficient Batch Normalization Using Few Uncorrelated Data for Statistics’ Estimation
- 4.67, 0.94, [4, 6, 4] [4, 5, 4]TequilaGAN: How To Easily Identify GAN Samples
- 4.67, 0.94, [6, 4, 4] [4, 4, 3]Gradient Descent Happens in a Tiny Subspace
- 4.67, 1.25, [5, 6, 3] [4, 4, 4]Dual Skew Divergence Loss for Neural Machine Translation
- 4.67, 0.47, [4, 5, 5] [3, 4, 3]Stochastic Learning of Additive Second-Order Penalties with Applications to Fairness
- 4.67, 0.94, [6, 4, 4] [5, 4, 4]Like What You Like: Knowledge Distill via Neuron Selectivity Transfer
- 4.67, 0.94, [4, 4, 6] [4, 4, 4]Boosting Trust Region Policy Optimization by Normalizing flows Policy
- 4.67, 0.47, [4, 5, 5] [4, 3, 4]Backplay: ‘Man muss immer umkehren’
- 4.67, 0.94, [4, 4, 6] [4, 4, 4]HIGHLY EFFICIENT 8-BIT LOW PRECISION INFERENCE OF CONVOLUTIONAL NEURAL NETWORKS
- 4.67, 0.94, [6, 4, 4] [4, 3, 4]Improved resistance of neural networks to adversarial images through generative pre-training
- 4.67, 0.47, [4, 5, 5] [4, 5, 3]Context-aware Forecasting for Multivariate Stationary Time-series
- 4.67, 0.94, [4, 6, 4] [4, 4, 5]Selective Self-Training for semi-supervised Learning
- 4.67, 0.94, [4, 4, 6] [5, 3, 4]Learning with Little Data: Evaluation of Deep Learning Algorithms
- 4.67, 1.70, [7, 3, 4] [5, 4, 4]What a difference a pixel makes: An empirical examination of features used by CNNs for categorisation
- 4.67, 0.94, [6, 4, 4] [3, 4, 4]Improving latent variable descriptiveness by modelling rather than ad-hoc factors
- 4.67, 0.47, [5, 5, 4] [4, 3, 4]Conditional Network Embeddings
- 4.67, 1.70, [7, 3, 4] [3, 4, 3]Holographic and other Point Set Distances for Machine Learning
- 4.67, 0.94, [6, 4, 4] [3, 3, 4]Unsupervised Emergence of Spatial Structure from Sensorimotor Prediction
- 4.67, 0.47, [4, 5, 5] [5, 4, 4]PRUNING IN TRAINING: LEARNING AND RANKING SPARSE CONNECTIONS IN DEEP CONVOLUTIONAL NETWORKS
- 4.67, 0.47, [4, 5, 5] [4, 5, 4]RelWalk — A Latent Variable Model Approach to Knowledge Graph Embedding
- 4.67, 0.47, [5, 5, 4] [2, 3, 4]Unsupervised Expectation Learning for Multisensory Binding
- 4.67, 1.25, [6, 5, 3] [4, 4, 4]Sentence Encoding with Tree-Constrained Relation Networks
- 4.67, 0.47, [4, 5, 5] [3, 2, 3]Pushing the bounds of dropout
- 4.67, 0.94, [4, 6, 4] [4, 5, 4]StrokeNet: A Neural Painting Environment
- 4.67, 1.25, [5, 6, 3] [2, 2, 4]Intriguing Properties of Learned Representations
- 4.67, 1.25, [5, 3, 6] [4, 4, 4]Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication
- 4.67, 0.47, [5, 5, 4] [4, 4, 4]Computation-Efficient Quantization Method for Deep Neural Networks
- 4.67, 1.25, [3, 5, 6] [4, 5, 3]Pumpout: A Meta Approach for Robustly Training Deep Neural Networks with Noisy Labels
- 4.67, 0.47, [5, 5, 4] [4, 3, 4]Consistency-based anomaly detection with adaptive multiple-hypotheses predictions
- 4.67, 1.25, [5, 6, 3] [2, 4, 5]Integrated Steganography and Steganalysis with Generative Adversarial Networks
- 4.67, 0.47, [4, 5, 5] [4, 4, 5]Rectified Gradient: Layer-wise Thresholding for Sharp and Coherent Attribution Maps
- 4.67, 0.94, [6, 4, 4] [5, 4, 4]Generative replay with feedback connections as a general strategy for continual learning
- 4.67, 1.25, [6, 5, 3] [2, 3, 4]Double Viterbi: Weight Encoding for High Compression Ratio and Fast On-Chip Reconstruction for Deep Neural Network
- 4.67, 0.94, [6, 4, 4] [5, 3, 4]Effective Path: Know the Unknowns of Neural Network
- 4.67, 1.25, [3, 6, 5] [4, 4, 4]Siamese Capsule Networks
- 4.67, 0.47, [4, 5, 5] [5, 3, 4]Ergodic Measure Preserving Flows
- 4.67, 1.25, [5, 3, 6] [5, 5, 4]3D-RelNet: Joint Object and Relational Network for 3D Prediction
- 4.67, 0.47, [5, 5, 4] [4, 5, 4]Finding Mixed Nash Equilibria of Generative Adversarial Networks
- 4.67, 0.47, [5, 4, 5] [5, 5, 4]Investigating CNNs’ Learning Representation under label noise
- 4.67, 0.94, [4, 6, 4] [5, 4, 4]Conscious Inference for Object Detection
- 4.67, 0.47, [5, 4, 5] [4, 2, 3]Learning Information Propagation in the Dynamical Systems via Information Bottleneck Hierarchy
- 4.67, 0.47, [5, 4, 5] [2, 5, 4]TabNN: A Universal Neural Network Solution for Tabular Data
- 4.67, 1.25, [3, 5, 6] [3, 2, 4]Probabilistic Binary Neural Networks
- 4.67, 0.47, [5, 4, 5] [5, 4, 3]Gradient-based learning for F-measure and other performance metrics
- 4.67, 0.47, [4, 5, 5] [4, 5, 2]SEGEN: SAMPLE-ENSEMBLE GENETIC EVOLUTIONARY NETWORK MODEL
- 4.67, 0.94, [6, 4, 4] [4, 4, 4]Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization
- 4.67, 1.25, [5, 6, 3] [4, 4, 4]Learning to Drive by Observing the Best and Synthesizing the Worst
- 4.67, 1.89, [6, 2, 6] [5, 5, 3]Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
- 4.67, 1.25, [3, 6, 5] [3, 4, 3]MARGINALIZED AVERAGE ATTENTIONAL NETWORK FOR WEAKLY-SUPERVISED LEARNING
- 4.67, 1.70, [3, 7, 4] [3, 5, 4]Discriminative out-of-distribution detection for semantic segmentation
- 4.67, 0.47, [5, 5, 4] [4, 4, 3]Integral Pruning on Activations and Weights for Efficient Neural Networks
- 4.67, 0.47, [4, 5, 5] [4, 4, 4]Online Bellman Residue Minimization via Saddle Point Optimization
- 4.67, 0.47, [5, 4, 5] [4, 5, 4]Area Attention
- 4.67, 0.47, [5, 4, 5] [2, 3, 2]NEURAL MALWARE CONTROL WITH DEEP REINFORCEMENT LEARNING
- 4.67, 0.47, [5, 5, 4] [4, 5, 4]Variational Sparse Coding
- 4.67, 0.94, [6, 4, 4] [5, 3, 4]What Information Does a ResNet Compress?
- 4.67, 1.25, [5, 6, 3] [5, 4, 5]Interpreting Adversarial Robustness: A View from Decision Surface in Input Space
- 4.67, 0.47, [5, 5, 4] [4, 3, 4]LIT: Block-wise Intermediate Representation Training for Model Compression
- 4.67, 0.47, [5, 4, 5] [5, 4, 4]An Energy-Based Framework for Arbitrary Label Noise Correction
- 4.67, 0.94, [6, 4, 4] [1, 3, 2]ACE: Artificial Checkerboard Enhancer to Induce and Evade Adversarial Attacks
- 4.67, 0.47, [4, 5, 5] [4, 3, 4]SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
- 4.67, 0.94, [6, 4, 4] [4, 5, 4]Differentiable Expected BLEU for Text Generation
- 4.67, 0.47, [4, 5, 5] [4, 4, 4]Learning Joint Wasserstein Auto-Encoders for Joint Distribution Matching
- 4.67, 1.25, [6, 3, 5] [4, 4, 3]Exploiting Environmental Variation to Improve Policy Robustness in Reinforcement Learning
- 4.67, 0.47, [4, 5, 5] [4, 3, 4]Sufficient Conditions for Robustness to Adversarial Examples: a Theoretical and Empirical Study with Bayesian Neural Networks
- 4.67, 0.47, [4, 5, 5] [5, 4, 3]Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs
- 4.67, 0.47, [5, 5, 4] [4, 4, 3]PAIRWISE AUGMENTED GANS WITH ADVERSARIAL RECONSTRUCTION LOSS
- 4.67, 0.47, [5, 5, 4] [4, 3, 5]Learned optimizers that outperform on wall-clock and validation loss
- 4.67, 0.94, [4, 6, 4] [4, 4, 5]Stability of Stochastic Gradient Method with Momentum for Strongly Convex Loss Functions
- 4.67, 0.47, [5, 4, 5] [4, 3, 5]When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
- 4.67, 0.47, [5, 4, 5] [4, 5, 3]Convergence Guarantees for RMSProp and ADAM in Non-Convex Optimization and an Empirical Comparison to Nesterov Acceleration
- 4.67, 0.94, [4, 6, 4] [5, 4, 4]Geometry aware convolutional filters for omnidirectional images representation
- 4.67, 0.94, [6, 4, 4] [2, 3, 5]FEATURE PRIORITIZATION AND REGULARIZATION IMPROVE STANDARD ACCURACY AND ADVERSARIAL ROBUSTNESS
- 4.67, 0.47, [4, 5, 5] [4, 5, 3]Learning Gibbs-regularized GANs with variational discriminator reparameterization
- 4.67, 0.47, [5, 4, 5] [4, 4, 2]Neural separation of observed and unobserved distributions
- 4.67, 0.47, [5, 4, 5] [3, 3, 3]Penetrating the Fog: the Path to Efficient CNN Models
- 4.67, 0.94, [4, 4, 6] [4, 3, 4]Expressiveness in Deep Reinforcement Learning
- 4.67, 0.47, [4, 5, 5] [5, 4, 4]Generating Realistic Stock Market Order Streams
- 4.67, 0.47, [4, 5, 5] [5, 3, 4]An investigation of model-free planning
- 4.67, 1.25, [3, 6, 5] [5, 3, 3]Selectivity metrics can overestimate the selectivity of units: a case study on AlexNet
- 4.67, 0.47, [5, 5, 4] [4, 3, 4]CNNSAT: Fast, Accurate Boolean Satisfiability using Convolutional Neural Networks
- 4.67, 0.47, [5, 5, 4] [3, 5, 5]Unifying Bilateral Filtering and Adversarial Training for Robust Neural Networks
- 4.67, 0.94, [6, 4, 4] [4, 4, 4]Sliced Wasserstein Auto-Encoders
- 4.67, 1.25, [5, 3, 6] [4, 3, 5]End-to-end learning of pharmacological assays from high-resolution microscopy images
- 4.67, 1.25, [3, 5, 6] [4, 3, 4]Safe Policy Learning from Observations
- 4.67, 0.94, [4, 4, 6] [4, 4, 3]A Study of Robustness of Neural Nets Using Approximate Feature Collisions
- 4.67, 0.47, [5, 5, 4] [4, 3, 3]SSoC: Learning Spontaneous and Self-Organizing Communication for Multi-Agent Collaboration
- 4.67, 1.25, [6, 3, 5] [4, 4, 3]On the Geometry of Adversarial Examples
- 4.67, 0.47, [4, 5, 5] [4, 3, 3]Neural Networks with Structural Resistance to Adversarial Attacks
- 4.67, 0.47, [5, 4, 5] [4, 4, 4]Partially Mutual Exclusive Softmax for Positive and Unlabeled data
- 4.67, 1.25, [3, 5, 6] [4, 4, 4]Unsupervised Disentangling Structure and Appearance
- 4.67, 0.47, [4, 5, 5] [4, 4, 4]Success at any cost: value constrained model-free continuous control
- 4.67, 0.47, [5, 4, 5] [4, 4, 3]Predictive Uncertainty through Quantization
- 4.67, 0.94, [6, 4, 4] [4, 5, 5]Maximum a Posteriori on a Submanifold: a General Image Restoration Method with GAN
- 4.67, 0.47, [5, 4, 5] [4, 4, 4]Zero-training Sentence Embedding via Orthogonal Basis
- 4.67, 0.47, [5, 4, 5] [4, 4, 4]The Expressive Power of Gated Recurrent Units as a Continuous Dynamical System
- 4.67, 0.94, [4, 4, 6] [4, 5, 3]SIMILE: Introducing Sequential Information towards More Effective Imitation Learning
- 4.67, 2.05, [7, 2, 5] [4, 5, 3]Meta-learning with differentiable closed-form solvers
- 4.67, 1.25, [3, 6, 5] [4, 4, 4]Mode Normalization
- 4.67, 0.94, [4, 6, 4] [2, 4, 4]Security Analysis of Deep Neural Networks Operating in the Presence of Cache Side-Channel Attacks
- 4.67, 1.25, [3, 5, 6] [4, 3, 4]NSGA-Net: A Multi-Objective Genetic Algorithm for Neural Architecture Search
- 4.67, 1.70, [4, 7, 3] [3, 4, 4]A theoretical framework for deep and locally connected ReLU network
- 4.67, 0.94, [4, 6, 4] [3, 3, 4]Approximation and non-parametric estimation of ResNet-type convolutional neural networks via block-sparse fully-connected neural networks
- 4.67, 0.47, [5, 5, 4] [4, 3, 5]Expanding the Reach of Federated Learning by Reducing Client Resource Requirements
- 4.67, 1.25, [3, 6, 5] [1, 4, 4]Pix2Scene: Learning Implicit 3D Representations from Images
- 4.67, 0.94, [4, 4, 6] [2, 5, 3]A Proposed Hierarchy of Deep Learning Tasks
- 4.67, 0.47, [5, 4, 5] [5, 5, 4]CGNF: Conditional Graph Neural Fields
- 4.67, 0.94, [6, 4, 4] [4, 4, 3]Self-Supervised Generalisation with Meta Auxiliary Learning
- 4.67, 0.47, [4, 5, 5] [4, 4, 2]Theoretical and Empirical Study of Adversarial Examples
- 4.67, 1.70, [4, 3, 7] [4, 4, 3]Coupled Recurrent Models for Polyphonic Music Composition
- 4.67, 0.94, [4, 6, 4] [3, 3, 4]DEEP-TRIM: REVISITING L1 REGULARIZATION FOR CONNECTION PRUNING OF DEEP NETWORK
- 4.67, 0.47, [5, 4, 5] [2, 4, 3]Transfer Value or Policy? A Value-centric Framework Towards Transferrable Continuous Reinforcement Learning
- 4.67, 0.47, [5, 5, 4] [4, 4, 4]Model Compression with Generative Adversarial Networks
- 4.67, 1.25, [6, 5, 3] [4, 4, 4]Text Infilling
- 4.67, 1.25, [6, 3, 5] [4, 3, 4]Visual Imitation with a Minimal Adversary
- 4.67, 1.25, [6, 3, 5] [3, 3, 3]Novel positional encodings to enable tree-structured transformers
- 4.67, 0.47, [5, 4, 5] [4, 4, 4]Shaping representations through communication
- 4.67, 0.47, [5, 4, 5] [3, 3, 4]Characterizing Vulnerabilities of Deep Reinforcement Learning
- 4.67, 0.47, [4, 5, 5] [4, 3, 4]Multi-Grained Entity Proposal Network for Named Entity Recognition
- 4.67, 0.47, [5, 5, 4] [3, 4, 4]Measuring Density and Similarity of Task Relevant Information in Neural Representations
- 4.67, 0.47, [5, 5, 4] [4, 3, 4]Outlier Detection from Image Data
- 4.67, 0.47, [5, 5, 4] [4, 3, 5]Accelerated Sparse Recovery Under Structured Measurements
- 4.67, 0.94, [6, 4, 4] [3, 3, 4]Object-Oriented Model Learning through Multi-Level Abstraction
- 4.67, 1.70, [3, 7, 4] [4, 3, 3]Learning to control self-assembling morphologies: a study of generalization via modularity
- 4.67, 0.47, [5, 5, 4] [4, 4, 4]Using GANs for Generation of Realistic City-Scale Ride Sharing/Hailing Data Sets
- 4.67, 0.47, [4, 5, 5] [3, 4, 4]Manifold Alignment via Feature Correspondence
- 4.67, 1.70, [3, 4, 7] [4, 4, 3]Explicit Recall for Efficient Exploration
- 4.67, 0.47, [5, 4, 5] [4, 4, 3]Differential Equation Networks
- 4.67, 0.47, [5, 4, 5] [4, 4, 4]Predicting the Present and Future States of Multi-agent Systems from Partially-observed Visual Data
- 4.67, 0.47, [5, 5, 4] [4, 4, 5]Learning shared manifold representation of images and attributes for generalized zero-shot learning
- 4.67, 0.47, [5, 4, 5] [3, 5, 4]Inference of unobserved event streams with neural Hawkes particle smoothing
- 4.50, 0.50, [5, 4] [3, 3]Improving On-policy Learning with Statistical Reward Accumulation
- 4.50, 0.50, [5, 4, 4, 5] [2, 3, 2, 2]Unification of Recurrent Neural Network Architectures and Quantum Inspired Stable Design
- 4.50, 0.50, [5, 5, 4, 4] [3, 3, 4, 4]One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL
- 4.50, 0.50, [4, 5] [4, 4]Fast Exploration with Simplified Models and Approximately Optimistic Planning in Model Based Reinforcement Learning
- 4.50, 0.50, [5, 4, 4, 5] [3, 4, 4, 4]Music Transformer
- 4.40, 0.80, [6, 4, 4, 4, 4] [4, 5, 4, 3, 5]Context Dependent Modulation of Activation Function
- 4.33, 0.47, [5, 4, 4] [4, 4, 2]Unsupervised classification into unknown k classes
- 4.33, 0.47, [4, 4, 5] [4, 5, 4]Adaptive Convolutional ReLUs
- 4.33, 0.47, [4, 4, 5] [4, 3, 3]FEED: Feature-level Ensemble Effect for knowledge Distillation
- 4.33, 1.89, [3, 3, 7] [4, 3, 3]Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks
- 4.33, 1.25, [4, 6, 3] [3, 2, 4]Variation Network: Learning High-level Attributes for Controlled Input Manipulation
- 4.33, 0.94, [5, 3, 5] [3, 5, 4]Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference
- 4.33, 1.25, [3, 6, 4] [4, 4, 3]Targeted Adversarial Examples for Black Box Audio Systems
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]Neuron Hierarchical Networks
- 4.33, 1.70, [6, 5, 2] [5, 4, 5]Online Learning for Supervised Dimension Reduction
- 4.33, 0.47, [4, 5, 4] [4, 4, 4]Opportunistic Learning: Budgeted Cost-Sensitive Learning from Data Streams
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]MANIFOLDNET: A DEEP NEURAL NETWORK FOR MANIFOLD-VALUED DATA
- 4.33, 1.25, [4, 6, 3] [2, 3, 4]Unsupervised Meta-Learning for Reinforcement Learning
- 4.33, 1.70, [5, 2, 6] [3, 5, 3]q-Neurons: Neuron Activations based on Stochastic Jackson’s Derivative Operators
- 4.33, 0.47, [5, 4, 4] [4, 4, 4]Learning a Neural-network-based Representation for Open Set Recognition
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]No Pressure! Addressing Problem of Local Minima in Manifold Learning
- 4.33, 1.25, [3, 4, 6] [5, 3, 3]On the Convergence and Robustness of Batch Normalization
- 4.33, 0.47, [4, 5, 4] [4, 5, 4]Sample Efficient Deep Neuroevolution in Low Dimensional Latent Space
- 4.33, 1.25, [6, 3, 4] [3, 5, 4]Context-adaptive Entropy Model for End-to-end Optimized Image Compression
- 4.33, 1.25, [4, 3, 6] [4, 4, 3]An Adversarial Learning Framework for a Persona-based Multi-turn Dialogue Model
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]ODIN: Outlier Detection In Neural Networks
- 4.33, 0.47, [5, 4, 4] [4, 4, 4]Log Hyperbolic Cosine Loss Improves Variational Auto-Encoder
- 4.33, 0.94, [5, 3, 5] [4, 4, 4]Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization
- 4.33, 0.47, [5, 4, 4] [3, 5, 3]A preconditioned accelerated stochastic gradient descent algorithm
- 4.33, 0.47, [4, 4, 5] [4, 3, 3]Local Stability and Performance of Simple Gradient Penalty $\mu$-Wasserstein GAN
- 4.33, 0.47, [5, 4, 4] [3, 4, 4]Efficient Convolutional Neural Network Training with Direct Feedback Alignment
- 4.33, 1.25, [3, 4, 6] [5, 5, 2]LEARNING ADVERSARIAL EXAMPLES WITH RIEMANNIAN GEOMETRY
- 4.33, 0.47, [4, 5, 4] [5, 3, 5]SHAMANN: Shared Memory Augmented Neural Networks
- 4.33, 0.47, [4, 4, 5] [4, 3, 3]Adaptive Convolutional Neural Networks
- 4.33, 1.25, [3, 6, 4] [4, 3, 3]Pixel Redrawn For A Robust Adversarial Defense
- 4.33, 0.47, [4, 4, 5] [4, 3, 3]DeepTwist: Learning Model Compression via Occasional Weight Distortion
- 4.33, 1.25, [4, 6, 3] [3, 3, 5]Wasserstein proximal of GANs
- 4.33, 1.25, [4, 3, 6] [4, 4, 2]Augmented Cyclic Adversarial Learning for Low Resource Domain Adaptation
- 4.33, 0.94, [3, 5, 5] [5, 2, 3]Exploration by Uncertainty in Reward Space
- 4.33, 0.47, [4, 5, 4] [4, 4, 5]Contextualized Role Interaction for Neural Machine Translation
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]Escaping Flat Areas via Function-Preserving Structural Network Modifications
- 4.33, 0.47, [4, 5, 4] [4, 3, 4]DVOLVER: Efficient Pareto-Optimal Neural Network Architecture Search
- 4.33, 0.47, [4, 5, 4] [4, 4, 3]Classifier-agnostic saliency map extraction
- 4.33, 0.47, [4, 5, 4] [4, 4, 3]PRUNING WITH HINTS: AN EFFICIENT FRAMEWORK FOR MODEL ACCELERATION
- 4.33, 0.94, [3, 5, 5] [3, 4, 4]Meta-Learning with Individualized Feature Space for Few-Shot Classification
- 4.33, 0.94, [5, 5, 3] [2, 2, 3]Downsampling leads to Image Memorization in Convolutional Autoencoders
- 4.33, 1.25, [3, 6, 4] [4, 5, 3]FAST OBJECT LOCALIZATION VIA SENSITIVITY ANALYSIS
- 4.33, 0.47, [4, 5, 4] [4, 3, 4]Generative Models from the perspective of Continual Learning
- 4.33, 0.47, [4, 5, 4] [3, 5, 5]Total Style Transfer with a Single Feed-Forward Network
- 4.33, 0.94, [3, 5, 5] [4, 5, 5]A fast quasi-Newton-type method for large-scale stochastic optimisation
- 4.33, 0.94, [5, 5, 3] [4, 4, 4]Explainable Adversarial Learning: Implicit Generative Modeling of Random Noise during Training for Adversarial Robustness
- 4.33, 0.47, [5, 4, 4] [4, 5, 5]Universal Attacks on Equivariant Networks
- 4.33, 0.94, [5, 5, 3] [4, 4, 4]Compound Density Networks
- 4.33, 0.47, [4, 5, 4] [5, 2, 3]A Guider Network for Multi-Dual Learning
- 4.33, 0.94, [5, 5, 3] [3, 4, 4]ON BREIMAN’S DILEMMA IN NEURAL NETWORKS: SUCCESS AND FAILURE OF NORMALIZED MARGINS
- 4.33, 0.47, [4, 5, 4] [4, 3, 4]Recovering the Lowest Layer of Deep Networks with High Threshold Activations
- 4.33, 2.05, [2, 4, 7] [5, 3, 3]Mental Fatigue Monitoring using Brain Dynamics Preferences
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]Progressive Weight Pruning Of Deep Neural Networks Using ADMM
- 4.33, 0.47, [4, 4, 5] [3, 4, 4]MixFeat: Mix Feature in Latent Space Learns Discriminative Space
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]The Cakewalk Method
- 4.33, 1.25, [3, 6, 4] [4, 4, 4]On Generalization Bounds of a Family of Recurrent Neural Networks
- 4.33, 1.25, [6, 4, 3] [3, 4, 4]Auto-Encoding Knockoff Generator for FDR Controlled Variable Selection
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]In Your Pace: Learning the Right Example at the Right Time
- 4.33, 0.94, [5, 3, 5] [3, 3, 3]Backdrop: Stochastic Backpropagation
- 4.33, 0.47, [5, 4, 4] [3, 5, 4]SENSE: SEMANTICALLY ENHANCED NODE SEQUENCE EMBEDDING
- 4.33, 0.47, [4, 5, 4] [5, 4, 5]Task-GAN for Improved GAN based Image Restoration
- 4.33, 0.47, [4, 4, 5] [5, 3, 4]EFFICIENT SEQUENCE LABELING WITH ACTOR-CRITIC TRAINING
- 4.33, 1.25, [4, 6, 3] [4, 4, 4]Robust Determinantal Generative Classifier for Noisy Labels and Adversarial Attacks
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]Beyond Winning and Losing: Modeling Human Motivations and Behaviors with Vector-valued Inverse Reinforcement Learning
- 4.33, 0.47, [5, 4, 4] [3, 5, 3]Combining Learned Representations for Combinatorial Optimization
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]From Nodes to Networks: Evolving Recurrent Neural Networks
- 4.33, 0.94, [5, 5, 3] [3, 4, 5]DppNet: Approximating Determinantal Point Processes with Deep Networks
- 4.33, 0.47, [5, 4, 4] [4, 4, 4]Implicit Maximum Likelihood Estimation
- 4.33, 0.47, [5, 4, 4] [4, 4, 4]Deep Ensemble Bayesian Active Learning : Addressing the Mode Collapse issue in Monte Carlo dropout via Ensembles
- 4.33, 0.47, [4, 4, 5] [5, 4, 4]Asynchronous SGD without gradient delay for efficient distributed training
- 4.33, 0.47, [4, 5, 4] [3, 3, 3]On the effect of the activation function on the distribution of hidden nodes in a deep network
- 4.33, 1.25, [3, 4, 6] [5, 4, 4]Learning Corresponded Rationales for Text Matching
- 4.33, 1.25, [4, 3, 6] [3, 3, 3]REPRESENTATION COMPRESSION AND GENERALIZATION IN DEEP NEURAL NETWORKS
- 4.33, 1.89, [3, 3, 7] [5, 4, 4]Meta-Learning to Guide Segmentation
- 4.33, 1.89, [7, 3, 3] [4, 4, 5]Recycling the discriminator for improving the inference mapping of GAN
- 4.33, 0.47, [5, 4, 4] [4, 4, 5]A Convergent Variant of the Boltzmann Softmax Operator in Reinforcement Learning
- 4.33, 1.25, [4, 6, 3] [4, 4, 4]Neural Probabilistic Motor Primitives for Humanoid Control
- 4.33, 1.70, [5, 2, 6] [3, 4, 3]Dual Learning: Theoretical Study and Algorithmic Extensions
- 4.33, 0.47, [5, 4, 4] [4, 3, 4]Visual Imitation Learning with Recurrent Siamese Networks
- 4.33, 0.47, [4, 5, 4] [5, 3, 3]Learning Hash Codes via Hamming Distance Targets
- 4.33, 0.94, [3, 5, 5] [5, 4, 3]Improving Sample-based Evaluation for Generative Adversarial Networks
- 4.33, 1.25, [6, 3, 4] [4, 5, 1]Bayesian Convolutional Neural Networks with Many Channels are Gaussian Processes
- 4.33, 0.47, [4, 5, 4] [5, 4, 3]Successor Uncertainties: exploration and uncertainty in temporal difference learning
- 4.33, 0.47, [4, 4, 5] [5, 3, 4]Jumpout: Improved Dropout for Deep Neural Networks with Rectified Linear Units
- 4.33, 0.47, [4, 4, 5] [5, 4, 4]Pseudosaccades: A simple ensemble scheme for improving classification performance of deep nets
- 4.33, 1.25, [3, 4, 6] [5, 5, 2]Modeling Dynamics of Biological Systems with Deep Generative Neural Networks
- 4.33, 0.47, [5, 4, 4] [5, 4, 5]A SINGLE SHOT PCA-DRIVEN ANALYSIS OF NETWORK STRUCTURE TO REMOVE REDUNDANCY
- 4.33, 0.47, [5, 4, 4] [4, 4, 4]Over-parameterization Improves Generalization in the XOR Detection Problem
- 4.33, 0.47, [4, 4, 5] [4, 4, 5]Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data
- 4.33, 0.47, [4, 4, 5] [4, 3, 4]Rating Continuous Actions in Spatial Multi-Agent Problems
- 4.33, 0.47, [4, 5, 4] [3, 3, 4]Adversarial Examples Are a Natural Consequence of Test Error in Noise
- 4.33, 1.25, [4, 3, 6] [4, 5, 4]Where and when to look? Spatial-temporal attention for action recognition in videos
- 4.33, 0.47, [4, 5, 4] [4, 4, 5]LARGE BATCH SIZE TRAINING OF NEURAL NETWORKS WITH ADVERSARIAL TRAINING AND SECOND-ORDER INFORMATION
- 4.33, 1.25, [6, 3, 4] [5, 4, 1]Teaching to Teach by Structured Dark Knowledge
- 4.33, 0.94, [5, 5, 3] [4, 4, 3]Prototypical Examples in Deep Learning: Metrics, Characteristics, and Utility
- 4.33, 0.47, [4, 4, 5] [4, 5, 4]End-to-End Hierarchical Text Classification with Label Assignment Policy
- 4.33, 1.89, [3, 3, 7] [4, 4, 3]Structured Prediction using cGANs with Fusion Discriminator
- 4.33, 1.25, [6, 4, 3] [5, 4, 4]Open Vocabulary Learning on Source Code with a Graph-Structured Cache
- 4.33, 0.94, [3, 5, 5] [4, 3, 3]Modulated Variational Auto-Encoders for Many-to-Many Musical Timbre Transfer
- 4.33, 0.94, [5, 3, 5] [3, 5, 3]Variational recurrent models for representation learning
- 4.33, 0.94, [5, 3, 5] [4, 5, 4]Inter-BMV: Interpolation with Block Motion Vectors for Fast Semantic Segmentation on Video
- 4.33, 0.47, [4, 4, 5] [4, 4, 4]Do Language Models Have Common Sense?
- 4.33, 0.94, [5, 5, 3] [3, 4, 5]Model-Agnostic Meta-Learning for Multimodal Task Distributions
- 4.33, 0.47, [4, 5, 4] [4, 3, 4]How Training Data Affect the Accuracy and Robustness of Neural Networks for Image Classification
- 4.33, 1.25, [3, 6, 4] [5, 2, 5]Locally Linear Unsupervised Feature Selection
- 4.33, 0.47, [5, 4, 4] [4, 4, 3]SALSA-TEXT : SELF ATTENTIVE LATENT SPACE BASED ADVERSARIAL TEXT GENERATION
- 4.33, 0.47, [4, 5, 4] [5, 5, 5]Harmonic Unpaired Image-to-image Translation
- 4.33, 1.25, [6, 3, 4] [3, 4, 4]On Meaning-Preserving Adversarial Perturbations for Sequence-to-Sequence Models
- 4.33, 0.94, [5, 5, 3] [3, 4, 1]Meta-Learning Neural Bloom Filters
- 4.33, 0.47, [4, 4, 5] [4, 4, 3]BlackMarks: Black-box Multi-bit Watermarking for Deep Neural Networks
- 4.33, 1.25, [3, 6, 4] [5, 4, 4]Optimal Attacks against Multiple Classifiers
- 4.33, 0.47, [4, 4, 5] [4, 4, 2]Evolutionary-Neural Hybrid Agents for Architecture Search
- 4.33, 1.89, [7, 3, 3] [2, 4, 4]On Inductive Biases in Deep Reinforcement Learning
- 4.33, 1.25, [4, 3, 6] [3, 4, 3]W2GAN: RECOVERING AN OPTIMAL TRANSPORT MAP WITH A GAN
- 4.33, 0.94, [5, 5, 3] [2, 4, 4]Latent Transformations for Object View Points Synthesis
- 4.33, 1.25, [4, 3, 6] [3, 5, 4]Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
- 4.33, 0.47, [4, 5, 4] [3, 3, 3]Learning to Control Visual Abstractions for Structured Exploration in Deep Reinforcement Learning
- 4.33, 0.94, [3, 5, 5] [4, 2, 4]Multi-Objective Value Iteration with Parameterized Threshold-Based Safety Constraints
- 4.33, 0.47, [5, 4, 4] [4, 2, 4]Select Via Proxy: Efficient Data Selection For Training Deep Networks
- 4.33, 0.47, [5, 4, 4] [3, 5, 3]Variational Domain Adaptation
- 4.33, 0.47, [4, 5, 4] [4, 5, 5]COMPOSITION AND DECOMPOSITION OF GANS
- 4.33, 0.94, [5, 5, 3] [4, 5, 4]PIE: Pseudo-Invertible Encoder
- 4.33, 0.47, [5, 4, 4] [4, 4, 2]TopicGAN: Unsupervised Text Generation from Explainable Latent Topics
- 4.33, 0.47, [4, 5, 4] [3, 3, 4]NICE: noise injection and clamping estimation for neural network quantization
- 4.33, 0.94, [5, 3, 5] [3, 5, 5]Network Reparameterization for Unseen Class Categorization
- 4.33, 0.94, [3, 5, 5] [4, 3, 3]Neural Rendering Model: Joint Generation and Prediction for Semi-Supervised Learning
- 4.33, 1.25, [4, 3, 6] [4, 3, 4]Architecture Compression
- 4.33, 1.89, [3, 7, 3] [3, 3, 2]A model cortical network for spatiotemporal sequence learning and prediction
- 4.33, 0.47, [4, 4, 5] [4, 3, 2]Modulating transfer between tasks in gradient-based meta-learning
- 4.33, 0.47, [4, 4, 5] [3, 4, 3]Mean Replacement Pruning
- 4.33, 0.47, [4, 5, 4] [4, 5, 3]Stochastic Quantized Activation: To prevent Overfitting in Fast Adversarial Training
- 4.33, 0.94, [5, 3, 5] [3, 4, 3]Provable Defenses against Spatially Transformed Adversarial Inputs: Impossibility and Possibility Results
- 4.33, 0.94, [5, 3, 5] [5, 3, 4]Learning Physics Priors for Deep Reinforcement Learning
- 4.33, 1.25, [3, 4, 6] [5, 4, 3]Looking inside the black box: assessing the modular structure of deep generative models with counterfactuals
- 4.33, 0.47, [5, 4, 4] [4, 4, 5]Correction Networks: Meta-Learning for Zero-Shot Learning
- 4.33, 0.94, [5, 5, 3] [2, 3, 5]Assessing Generalization in Deep Reinforcement Learning
- 4.33, 0.94, [5, 3, 5] [4, 3, 4]Bridging HMMs and RNNs through Architectural Transformations
- 4.33, 0.47, [4, 4, 5] [4, 2, 4]Variadic Learning by Bayesian Nonparametric Deep Embedding
- 4.25, 0.43, [5, 4, 4, 4] [3, 4, 5, 2]Characterizing the Accuracy/Complexity Landscape of Explanations of Deep Networks through Knowledge Extraction
- 4.25, 0.43, [5, 4, 4, 4] [3, 4, 3, 3]A Priori Estimates of the Generalization Error for Two-layer Neural Networks
- 4.25, 0.43, [5, 4, 4, 4] [3, 4, 4, 5]Countdown Regression: Sharp and Calibrated Survival Predictions
- 4.25, 1.48, [2, 4, 6, 5] [4, 3, 4, 3]Understanding the Asymptotic Performance of Model-Based RL Methods
- 4.25, 0.43, [4, 5, 4, 4] [4, 4, 3, 4]Unlabeled Disentangling of GANs with Guided Siamese Networks
- 4.25, 0.43, [4, 4, 4, 5] [4, 5, 5, 4]Discovering General-Purpose Active Learning Strategies
- 4.00, 0.00, [4, 4, 4] [5, 4, 4]Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourages convex latent distributions
- 4.00, 0.82, [4, 3, 5] [4, 5, 5]The Forward-Backward Embedding of Directed Graphs
- 4.00, 1.41, [6, 3, 3] [4, 5, 3]Large-scale classification of structured objects using a CRF with deep class embedding
- 4.00, 0.00, [4, 4, 4] [5, 4, 4]Overcoming catastrophic forgetting through weight consolidation and long-term memory
- 4.00, 0.82, [4, 3, 5] [4, 4, 5]Neural Network Cost Landscapes as Quantum States
- 4.00, 0.82, [5, 3, 4] [4, 4, 5]Adversarial Attacks for Optical Flow-Based Action Recognition Classifiers
- 4.00, 0.82, [4, 3, 5] [3, 5, 2]Learning Latent Semantic Representation from Pre-defined Generative Model
- 4.00, 0.00, [4, 4, 4] [5, 4, 3]HC-Net: Memory-based Incremental Dual-Network System for Continual learning
- 4.00, 0.00, [4, 4, 4] [4, 4, 5]Sequence Modelling with Memory-Augmented Recurrent Neural Networks
- 4.00, 0.82, [3, 5, 4] [4, 3, 4]MERCI: A NEW METRIC TO EVALUATE THE CORRELATION BETWEEN PREDICTIVE UNCERTAINTY AND TRUE ERROR
- 4.00, 0.00, [4, 4] [1, 2]S-System, Geometry, Learning, and Optimization: A Theory of Neural Networks
- 4.00, 0.82, [3, 4, 5] [4, 3, 4]Difference-Seeking Generative Adversarial Network
- 4.00, 0.82, [5, 4, 3] [4, 4, 4]Semantic Parsing via Cross-Domain Schema
- 4.00, 0.82, [5, 4, 3] [4, 4, 5]On the Selection of Initialization and Activation Function for Deep Neural Networks
- 4.00, 0.00, [4, 4, 4] [3, 4, 3]Deep processing of structured data
- 4.00, 0.82, [3, 5, 4] [4, 4, 4]Better Accuracy with Quantified Privacy: Representations Learned via Reconstructive Adversarial Network
- 4.00, 0.82, [5, 4, 3] [3, 4, 3]Modular Deep Probabilistic Programming
- 4.00, 0.00, [4, 4, 4] [3, 5, 4]Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning
- 4.00, 0.82, [5, 4, 3] [4, 5, 4]A Multi-modal one-class generative adversarial network for anomaly detection in manufacturing
- 4.00, 0.82, [4, 3, 5] [4, 4, 4]Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics
- 4.00, 0.82, [5, 3, 4] [4, 4, 4]Assumption Questioning: Latent Copying and Reward Exploitation in Question Generation
- 4.00, 0.82, [4, 3, 5] [5, 4, 3]Polar Prototype Networks
- 4.00, 0.82, [3, 5, 4] [4, 4, 5]Applications of Gaussian Processes in Finance
- 4.00, 0.82, [5, 4, 3] [4, 4, 4]Incremental Hierarchical Reinforcement Learning with Multitask LMDPs
- 4.00, 0.82, [3, 4, 5] [3, 4, 3]On the Statistical and Information Theoretical Characteristics of DNN Representations
- 4.00, 0.00, [4, 4, 4] [4, 5, 4]Explaining Neural Networks Semantically and Quantitatively
- 4.00, 1.41, [6, 3, 3] [3, 3, 3]microGAN: Promoting Variety through Microbatch Discrimination
- 4.00, 0.82, [5, 3, 4] [4, 5, 2]PA-GAN: Improving GAN Training by Progressive Augmentation
- 4.00, 0.00, [4, 4, 4] [4, 3, 2]Deep Generative Models for learning Coherent Latent Representations from Multi-Modal Data
- 4.00, 0.82, [3, 5, 4] [4, 3, 4]Overfitting Detection of Deep Neural Networks without a Hold Out Set
- 4.00, 0.00, [4, 4, 4] [3, 4, 5]Mol-CycleGAN – a generative model for molecular optimization
- 4.00, 0.00, [4, 4, 4] [4, 4, 4]NUTS: Network for Unsupervised Telegraphic Summarization
- 4.00, 0.00, [4, 4, 4] [4, 4, 3]Sample-efficient policy learning in multi-agent Reinforcement Learning via meta-learning
- 4.00, 0.82, [4, 5, 3] [4, 3, 4]Few-shot Classification on Graphs with Structural Regularized GCNs
- 4.00, 0.82, [3, 5, 4] [5, 3, 5]Second-Order Adversarial Attack and Certifiable Robustness
- 4.00, 1.63, [4, 2, 6] [3, 5, 2]Reinforcement Learning: From temporal to spatial value decomposition
- 4.00, 0.00, [4, 4, 4] [4, 4, 5]EXPLORATION OF EFFICIENT ON-DEVICE ACOUSTIC MODELING WITH NEURAL NETWORKS
- 4.00, 1.41, [5, 2, 5] [5, 4, 4]The effectiveness of layer-by-layer training using the information bottleneck principle
- 4.00, 0.82, [3, 5, 4] [4, 3, 3]Layerwise Recurrent Autoencoder for General Real-world Traffic Flow Forecasting
- 4.00, 0.82, [3, 4, 5] [4, 4, 5]ON THE USE OF CONVOLUTIONAL AUTO-ENCODER FOR INCREMENTAL CLASSIFIER LEARNING IN CONTEXT AWARE ADVERTISEMENT
- 4.00, 0.00, [4, 4, 4] [4, 4, 4]ChainGAN: A sequential approach to GANs
- 4.00, 0.00, [4, 4, 4] [4, 5, 5]Activity Regularization for Continual Learning
- 4.00, 0.82, [5, 4, 3] [3, 4, 5]Robustness and Equivariance of Neural Networks
- 4.00, 0.82, [5, 4, 3] [4, 4, 3]Distributionally Robust Optimization Leads to Better Generalization: on SGD and Beyond
- 4.00, 0.82, [5, 3, 4] [4, 4, 4]D2KE: From Distance to Kernel and Embedding via Random Features For Structured Inputs
- 4.00, 0.00, [4, 4, 4] [4, 4, 5]Hyper-Regularization: An Adaptive Choice for the Learning Rate in Gradient Descent
- 4.00, 0.82, [4, 5, 3] [3, 5, 5]Complexity of Training ReLU Neural Networks
- 4.00, 1.41, [2, 4, 4, 6] [5, 4, 2, 2]Efficient Exploration through Bayesian Deep Q-Networks
- 4.00, 0.82, [4, 5, 3] [4, 5, 4]Sequenced-Replacement Sampling for Deep Learning
- 4.00, 0.00, [4, 4, 4] [4, 5, 5]DEEP ADVERSARIAL FORWARD MODEL
- 4.00, 0.82, [5, 4, 3] [3, 4, 4]Look Ma, No GANs! Image Transformation with ModifAE
- 4.00, 0.82, [4, 5, 3] [3, 3, 4]Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning
- 4.00, 0.82, [5, 4, 3] [4, 4, 5]ACIQ: Analytical Clipping for Integer Quantization of neural networks
- 4.00, 0.82, [5, 4, 3] [4, 3, 4]Constrained Bayesian Optimization for Automatic Chemical Design
- 4.00, 0.82, [3, 5, 4] [4, 3, 5]The wisdom of the crowd: reliable deep reinforcement learning through ensembles of Q-functions
- 4.00, 0.00, [4, 4, 4] [4, 3, 4]Co-manifold learning with missing data
- 4.00, 0.00, [4, 4] [5, 4]Fast Binary Functional Search on Graph
- 4.00, 0.82, [3, 4, 5] [4, 3, 4]Towards More Theoretically-Grounded Particle Optimization Sampling for Deep Learning
- 4.00, 0.00, [4, 4, 4] [4, 3, 4]Differentially Private Federated Learning: A Client Level Perspective
- 4.00, 0.82, [3, 5, 4] [4, 2, 4]UaiNets: From Unsupervised to Active Deep Anomaly Detection
- 4.00, 0.82, [5, 4, 3] [4, 4, 4]Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy
- 4.00, 0.82, [5, 3, 4] [3, 4, 3]In search of theoretically grounded pruning
- 4.00, 1.63, [2, 4, 6] [5, 3, 5]Label Smoothing and Logit Squeezing: A Replacement for Adversarial Training?
- 4.00, 0.82, [3, 4, 5] [5, 4, 4]Learning Representations in Model-Free Hierarchical Reinforcement Learning
- 4.00, 0.82, [5, 3, 4] [4, 5, 4]Dual Importance Weight GAN
- 4.00, 0.00, [4, 4, 4] [5, 5, 4]Relational Graph Attention Networks
- 4.00, 0.00, [4, 4, 4] [3, 4, 5]HyperGAN: Exploring the Manifold of Neural Networks
- 4.00, 0.82, [4, 3, 5] [5, 5, 3]Generalized Capsule Networks with Trainable Routing Procedure
- 4.00, 0.82, [4, 3, 5] [2, 2, 4]Distilled Agent DQN for Provable Adversarial Robustness
- 4.00, 0.00, [4, 4, 4] [4, 5, 4]Distinguishability of Adversarial Examples
- 4.00, 1.41, [6, 3, 3] [4, 5, 4]Iteratively Learning from the Best
- 4.00, 0.82, [5, 3, 4] [3, 4, 3]Evaluating GANs via Duality
- 4.00, 0.82, [4, 3, 5] [3, 4, 4]Constraining Action Sequences with Formal Languages for Deep Reinforcement Learning
- 4.00, 0.82, [4, 3, 5] [5, 5, 4]Overlapping Community Detection with Graph Neural Networks
- 4.00, 0.82, [4, 5, 3] [4, 3, 5]DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation
- 4.00, 0.00, [4, 4, 4] [5, 4, 4]Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding
- 4.00, 0.82, [4, 5, 3] [4, 3, 3]Prob2Vec: Mathematical Semantic Embedding for Problem Retrieval in Adaptive Tutoring
- 4.00, 2.16, [5, 1, 6] [4, 4, 4]Understanding the Effectiveness of Lipschitz-Continuity in Generative Adversarial Nets
- 4.00, 0.82, [4, 3, 5] [4, 3, 4]Reconciling Feature-Reuse and Overfitting in DenseNet with Specialized Dropout
- 4.00, 0.82, [3, 5, 4] [4, 5, 4]N/A
- 4.00, 0.82, [5, 4, 3] [5, 4, 5]Training Hard-Threshold Networks with Combinatorial Search in a Discrete Target Propagation Setting
- 4.00, 0.00, [4, 4, 4] [4, 4, 4]Latent Domain Transfer: Crossing modalities with Bridging Autoencoders
- 4.00, 0.82, [4, 3, 5] [5, 4, 3]Neural Regression Tree
- 4.00, 0.82, [5, 3, 4] [5, 2, 4]Neural MMO: A massively multiplayer game environment for intelligent agents
- 4.00, 0.00, [4, 4, 4] [4, 4, 4]Uncertainty-guided Lifelong Learning in Bayesian Networks
- 4.00, 0.00, [4, 4, 4] [5, 3, 5]Language Modeling with Graph Temporal Convolutional Networks
- 4.00, 0.82, [4, 5, 3] [4, 5, 5]RNNs with Private and Shared Representations for Semi-Supervised Sequence Learning
- 4.00, 1.41, [2, 5, 5] [2, 2, 3]Universal discriminative quantum neural networks
- 4.00, 0.00, [4, 4, 4] [4, 4, 5]Learning to Search Efficient DenseNet with Layer-wise Pruning
- 4.00, 0.82, [3, 5, 4] [5, 4, 5]Understanding Opportunities for Efficiency in Single-image Super Resolution Networks
- 4.00, 0.82, [3, 4, 5] [3, 5, 4]Q-map: a Convolutional Approach for Goal-Oriented Reinforcement Learning
- 4.00, 0.82, [3, 5, 4] [5, 4, 4]Deepström Networks
- 4.00, 0.82, [4, 3, 5] [4, 3, 4]Pearl: Prototype lEArning via Rule Lists
- 4.00, 0.82, [3, 5, 4] [4, 2, 5]Reinforced Pipeline Optimization: Behaving Optimally with Non-Differentiabilities
- 4.00, 0.00, [4, 4, 4] [4, 4, 4]Trajectory VAE for multi-modal imitation
- 4.00, 0.00, [4, 4, 4] [3, 4, 5]DATA POISONING ATTACK AGAINST NODE EMBEDDING METHODS
- 4.00, 0.00, [4, 4, 4] [4, 3, 4]Unsupervised Exploration with Deep Model-Based Reinforcement Learning
- 4.00, 1.63, [2, 6, 4] [4, 4, 3]On the Trajectory of Stochastic Gradient Descent in the Information Plane
- 4.00, 0.82, [5, 4, 3] [2, 4, 3]Functional Bayesian Neural Networks for Model Uncertainty Quantification
- 4.00, 1.63, [6, 2, 4] [4, 4, 4]REVISTING NEGATIVE TRANSFER USING ADVERSARIAL LEARNING
- 4.00, 0.00, [4, 4, 4] [3, 4, 4]Learning from Noisy Demonstration Sets via Meta-Learned Suitability Assessor
- 4.00, 0.00, [4, 4, 4] [5, 3, 4]Ain’t Nobody Got Time for Coding: Structure-Aware Program Synthesis from Natural Language
- 4.00, 0.00, [4, 4, 4] [4, 3, 4]Graph Generation via Scattering
- 4.00, 1.41, [3, 3, 6] [2, 5, 4]Improving machine classification using human uncertainty measurements
- 4.00, 0.82, [3, 4, 5] [4, 5, 3]Empirically Characterizing Overparameterization Impact on Convergence
- 4.00, 0.00, [4, 4, 4] [4, 5, 4]Continual Learning via Explicit Structure Learning
- 3.67, 1.25, [5, 2, 4] [5, 4, 4]RESIDUAL NETWORKS CLASSIFY INPUTS BASED ON THEIR NEURAL TRANSIENT DYNAMICS
- 3.67, 0.47, [4, 3, 4] [5, 3, 4]Diminishing Batch Normalization
- 3.67, 1.70, [6, 3, 2] [3, 4, 1]Filter Training and Maximum Response: Classification via Discerning
- 3.67, 1.25, [4, 2, 5] [4, 4, 5]Optimizing for Generalization in Machine Learning with Cross-Validation Gradients
- 3.67, 0.47, [3, 4, 4] [3, 4, 3]Image Score: how to select useful samples
- 3.67, 0.47, [3, 4, 4] [3, 4, 2]Feature Attribution As Feature Selection
- 3.67, 1.25, [5, 4, 2] [3, 5, 5]Discrete Structural Planning for Generating Diverse Translations
- 3.67, 0.47, [4, 4, 3] [3, 4, 4]DynCNN: An Effective Dynamic Architecture on Convolutional Neural Network for Surveillance Videos
- 3.67, 0.47, [4, 3, 4] [4, 5, 3]An Attention-Based Model for Learning Dynamic Interaction Networks
- 3.67, 2.49, [3, 1, 7] [4, 5, 3]Optimization on Multiple Manifolds
- 3.67, 0.47, [3, 4, 4] [4, 5, 4]RETHINKING SELF-DRIVING: MULTI-TASK KNOWLEDGE FOR BETTER GENERALIZATION AND ACCIDENT EXPLANATION ABILITY
- 3.67, 2.49, [1, 7, 3] [5, 3, 4]Why Do Neural Response Generation Models Prefer Universal Replies?
- 3.67, 0.47, [4, 3, 4] [4, 4, 4]DelibGAN: Coarse-to-Fine Text Generation via Adversarial Network
- 3.67, 0.47, [3, 4, 4] [4, 5, 4]Encoding Category Trees Into Word-Embeddings Using Geometric Approach
- 3.67, 0.94, [3, 3, 5] [5, 5, 4]GradMix: Multi-source Transfer across Domains and Tasks
- 3.67, 0.47, [3, 4, 4] [5, 4, 3]Synthnet: Learning synthesizers end-to-end
- 3.67, 0.47, [4, 4, 3] [5, 4, 4]Prior Networks for Detection of Adversarial Attacks
- 3.67, 0.94, [3, 3, 5] [5, 4, 3]Localized random projections challenge benchmarks for bio-plausible deep learning
- 3.67, 0.94, [3, 5, 3] [2, 2, 3]A fully automated periodicity detection in time series
- 3.67, 0.47, [4, 4, 3] [5, 4, 4]Generating Images from Sounds Using Multimodal Features and GANs
- 3.67, 0.94, [5, 3, 3] [4, 5, 4]Text Embeddings for Retrieval from a Large Knowledge Base
- 3.67, 0.47, [4, 4, 3] [5, 4, 5]Explaining AlphaGo: Interpreting Contextual Effects in Neural Networks
- 3.67, 0.47, [3, 4, 4] [3, 4, 4]Riemannian Stochastic Gradient Descent for Tensor-Train Recurrent Neural Networks
- 3.67, 0.47, [4, 4, 3] [4, 3, 4]Learning agents with prioritization and parameter noise in continuous state and action space
- 3.67, 0.47, [4, 3, 4] [3, 5, 4]Hierarchical Attention: What Really Counts in Various NLP Tasks
- 3.67, 0.47, [3, 4, 4] [4, 3, 4]Radial Basis Feature Transformation to Arm CNNs Against Adversarial Attacks
- 3.67, 0.47, [4, 3, 4] [4, 2, 4]Using Deep Siamese Neural Networks to Speed up Natural Products Research
- 3.67, 0.47, [4, 3, 4] [3, 5, 4]Graph Spectral Regularization For Neural Network Interpretability
- 3.67, 0.47, [4, 4, 3] [4, 3, 5]Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning
- 3.67, 0.47, [4, 4, 3] [2, 4, 4]Using Word Embeddings to Explore the Learned Representations of Convolutional Neural Networks
- 3.67, 0.47, [4, 3, 4] [5, 5, 4]Question Generation using a Scratchpad Encoder
- 3.67, 0.47, [3, 4, 4] [4, 4, 4]Adversarially Robust Training through Structured Gradient Regularization
- 3.67, 0.47, [3, 4, 4] [5, 4, 3]GEOMETRIC AUGMENTATION FOR ROBUST NEURAL NETWORK CLASSIFIERS
- 3.67, 1.25, [2, 5, 4] [4, 3, 4]DEEP HIERARCHICAL MODEL FOR HIERARCHICAL SELECTIVE CLASSIFICATION AND ZERO SHOT LEARNING
- 3.67, 0.47, [4, 4, 3] [4, 5, 4]Mixture of Pre-processing Experts Model for Noise Robust Deep Learning on Resource Constrained Platforms
- 3.67, 0.47, [4, 3, 4] [5, 4, 3]Feature Transformers: A Unified Representation Learning Framework for Lifelong Learning
- 3.67, 0.47, [3, 4, 4] [4, 4, 5]Normalization Gradients are Least-squares Residuals
- 3.67, 0.47, [4, 3, 4] [5, 4, 4]DEEP GEOMETRICAL GRAPH CLASSIFICATION WITH DYNAMIC POOLING
- 3.67, 1.25, [4, 2, 5] [4, 5, 4]Differentiable Greedy Networks
- 3.67, 1.25, [5, 2, 4] [4, 5, 4]Kmer2vec: Towards transcriptomic representations by learning kmer embeddings
- 3.67, 0.47, [4, 3, 4] [4, 4, 5]Graph Learning Network: A Structure Learning Algorithm
- 3.67, 0.47, [3, 4, 4] [3, 5, 4]Controlling Over-generalization and its Effect on Adversarial Examples Detection and Generation
- 3.67, 0.47, [4, 3, 4] [4, 4, 4]PCNN: Environment Adaptive Model Without Finetuning
- 3.67, 0.47, [3, 4, 4] [4, 4, 5]Optimized Gated Deep Learning Architectures for Sensor Fusion
- 3.67, 0.47, [3, 4, 4] [4, 3, 5]A Walk with SGD: How SGD Explores Regions of Deep Network Loss?
- 3.67, 0.94, [3, 3, 5] [4, 4, 3]Automatic generation of object shapes with desired functionalities
- 3.67, 0.47, [4, 4, 3] [4, 5, 5]Dynamic Recurrent Language Model
- 3.67, 0.94, [3, 5, 3] [5, 1, 4]D-GAN: Divergent generative adversarial network for positive unlabeled learning and counter-examples generation
- 3.67, 0.47, [3, 4, 4] [4, 3, 4]Inhibited Softmax for Uncertainty Estimation in Neural Networks
- 3.67, 0.47, [4, 4, 3] [5, 5, 4]Unsupervised Video-to-Video Translation
- 3.67, 0.47, [3, 4, 4] [4, 3, 4]Efficient Federated Learning via Variational Dropout
- 3.67, 0.47, [4, 3, 4] [5, 5, 4]Contextual Recurrent Convolutional Model for Robust Visual Learning
- 3.67, 0.47, [4, 4, 3] [4, 4, 4]Unsupervised one-to-many image translation
- 3.67, 0.47, [3, 4, 4] [4, 3, 4]INTERPRETABLE CONVOLUTIONAL FILTER PRUNING
- 3.67, 0.94, [3, 3, 5] [5, 4, 3]Fake Sentence Detection as a Training Task for Sentence Encoding
- 3.67, 0.47, [4, 4, 3] [5, 3, 3]Accelerating first order optimization algorithms
- 3.67, 0.94, [3, 3, 5] [4, 4, 4]The Natural Language Decathlon: Multitask Learning as Question Answering
- 3.50, 1.12, [5, 2, 3, 4] [2, 5, 2, 3]Learning to Reinforcement Learn by Imitation
- 3.50, 0.50, [3, 3, 4, 4] [4, 2, 4, 3]LSH Microbatches for Stochastic Gradients: Value in Rearrangement
- 3.33, 0.47, [4, 3, 3] [3, 4, 4]Linearizing Visual Processes with Deep Generative Models
- 3.33, 0.47, [3, 3, 4] [4, 4, 3]Interpreting Layered Neural Networks via Hierarchical Modular Representation
- 3.33, 0.94, [4, 2, 4] [5, 4, 3]IEA: Inner Ensemble Average within a convolutional neural network
- 3.33, 0.47, [3, 3, 4] [4, 4, 4]Accidental exploration through value predictors
- 3.33, 0.47, [3, 3, 4] [5, 4, 3]Learning and Data Selection in Big Datasets
- 3.33, 0.47, [3, 3, 4] [4, 5, 4]Human Action Recognition Based on Spatial-Temporal Attention
- 3.33, 0.47, [3, 3, 4] [5, 3, 4]SHE2: Stochastic Hamiltonian Exploration and Exploitation for Derivative-Free Optimization
- 3.33, 0.47, [3, 4, 3] [5, 4, 4]Encoder Discriminator Networks for Unsupervised Representation Learning
- 3.33, 0.47, [4, 3, 3] [4, 4, 5]Understanding and Improving Sequence-Labeling NER with Self-Attentive LSTMs
- 3.33, 1.25, [3, 5, 2] [4, 5, 5]Geometric Operator Convolutional Neural Network
- 3.33, 0.47, [3, 4, 3] [5, 4, 5]Multi-Scale Stacked Hourglass Network for Human Pose Estimation
- 3.33, 0.47, [3, 4, 3] [5, 4, 5]A quantifiable testing of global translational invariance in Convolutional and Capsule Networks
- 3.33, 0.47, [3, 4, 3] [5, 5, 4]MAJOR-MINOR LSTMS FOR WORD-LEVEL LANGUAGE MODEL
- 3.33, 0.47, [3, 3, 4] [4, 4, 4]Deep models calibration with bayesian neural networks
- 3.33, 0.94, [4, 4, 2] [4, 3, 4]BIGSAGE: unsupervised inductive representation learning of graph via bi-attended sampling and global-biased aggregating
- 3.33, 1.25, [3, 2, 5] [4, 5, 3]Gradient Acceleration in Activation Functions
- 3.33, 0.47, [4, 3, 3] [5, 5, 4]BEHAVIOR MODULE IN NEURAL NETWORKS
- 3.33, 0.47, [3, 4, 3] [4, 3, 4]Neural Random Projections for Language Modelling
- 3.33, 0.47, [3, 4, 3] [4, 4, 5]Step-wise Sensitivity Analysis: Identifying Partially Distributed Representations for Interpretable Deep Learning
- 3.33, 0.94, [2, 4, 4] [4, 4, 3]Deconfounding Reinforcement Learning
- 3.33, 0.94, [2, 4, 4] [5, 4, 4]Detecting Topological Defects in 2D Active Nematics Using Convolutional Neural Networks
- 3.33, 0.47, [3, 3, 4] [3, 4, 5]Neural Distribution Learning for generalized time-to-event prediction
- 3.33, 0.47, [4, 3, 3] [3, 4, 5]Beyond Games: Bringing Exploration to Robots in Real-world
- 3.33, 0.47, [3, 4, 3] [4, 5, 4]Empirical Study of Easy and Hard Examples in CNN Training
- 3.33, 1.70, [1, 5, 4] [4, 3, 4]Deterministic Policy Gradients with General State Transitions
- 3.33, 0.47, [4, 3, 3] [4, 5, 4]Neural Network Regression with Beta, Dirichlet, and Dirichlet-Multinomial Outputs
- 3.33, 0.47, [3, 3, 4] [2, 4, 3]ATTACK GRAPH CONVOLUTIONAL NETWORKS BY ADDING FAKE NODES
- 3.33, 1.25, [3, 2, 5] [4, 5, 2]Generative model based on minimizing exact empirical Wasserstein distance
- 3.33, 1.25, [5, 2, 3] [3, 2, 5]Learning powerful policies and better dynamics models by encouraging consistency
- 3.33, 0.47, [3, 4, 3] [5, 3, 4]Non-Synergistic Variational Autoencoders
- 3.33, 1.25, [5, 2, 3] [3, 5, 5]Uncertainty in Multitask Transfer Learning
- 3.33, 1.25, [5, 2, 3] [3, 4, 3]The Conditional Entropy Bottleneck
- 3.33, 0.47, [4, 3, 3] [3, 3, 4]Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae
- 3.33, 0.47, [4, 3, 3] [4, 2, 4]Combining adaptive algorithms and hypergradient method: a performance and robustness study
- 3.00, 0.82, [2, 4, 3] [4, 4, 5]ATTENTION INCORPORATE NETWORK: A NETWORK CAN ADAPT VARIOUS DATA SIZE
- 3.00, 0.00, [3, 3, 3] [5, 4, 3]Nonlinear Channels Aggregation Networks for Deep Action Recognition
- 3.00, 0.82, [4, 2, 3] [5, 5, 4]Hybrid Policies Using Inverse Rewards for Reinforcement Learning
- 3.00, 0.82, [2, 4, 3] [4, 4, 5]An Exhaustive Analysis of Lazy vs. Eager Learning Methods for Real-Estate Property Investment
- 3.00, 0.82, [2, 4, 3] [5, 5, 5]Stacking for Transfer Learning
- 3.00, 0.00, [3, 3, 3] [5, 3, 4]Mapping the hyponymy relation of wordnet onto vector Spaces
- 3.00, 0.82, [2, 4, 3] [5, 4, 4]ReNeg and Backseat Driver: Learning from demonstration with continuous human feedback
- 3.00, 0.00, [3, 3, 3] [3, 4, 3]Real-time Neural-based Input Method
- 3.00, 1.00, [4, 2] [5, 4]FROM DEEP LEARNING TO DEEP DEDUCING: AUTOMATICALLY TRACKING DOWN NASH EQUILIBRIUM THROUGH AUTONOMOUS NEURAL AGENT, A POSSIBLE MISSING STEP TOWARD GENERAL A.I.
- 3.00, 0.82, [4, 2, 3] [4, 2, 1]Learning of Sophisticated Curriculums by viewing them as Graphs over Tasks
- 3.00, 0.00, [3, 3, 3] [5, 4, 5]iRDA Method for Sparse Convolutional Neural Networks
- 3.00, 0.82, [3, 4, 2] [2, 4, 5]Geometry of Deep Convolutional Networks
- 3.00, 0.82, [4, 2, 3] [5, 4, 4]Calibration of neural network logit vectors to combat adversarial attacks
- 3.00, 0.82, [2, 4, 3] [4, 2, 4]Probabilistic Program Induction for Intuitive Physics Game Play
- 3.00, 0.00, [3, 3, 3] [4, 3, 2]An Analysis of Composite Neural Network Performance from Function Composition Perspective
- 3.00, 0.00, [3, 3, 3] [3, 2, 4]Dopamine: A Research Framework for Deep Reinforcement Learning
- 3.00, 0.82, [3, 2, 4] [4, 4, 4]Learning with Reflective Likelihoods
- 3.00, 1.41, [4, 1, 4] [5, 4, 3]Variational Autoencoders for Text Modeling without Weakening the Decoder
- 3.00, 0.82, [4, 3, 2] [4, 3, 4]Evaluation Methodology for Attacks Against Confidence Thresholding Models
- 3.00, 0.00, [3, 3, 3] [4, 3, 3]A NON-LINEAR THEORY FOR SENTENCE EMBEDDING
- 3.00, 0.82, [4, 3, 2] [4, 3, 5]Learn From Neighbour: A Curriculum That Train Low Weighted Samples By Imitating
- 3.00, 0.00, [3, 3, 3] [4, 4, 4]One Bit Matters: Understanding Adversarial Examples as the Abuse of Redundancy
- 3.00, 0.82, [4, 3, 2] [2, 3, 4]Feature quantization for parsimonious and interpretable predictive models
- 3.00, 0.00, [3, 3, 3] [4, 5, 4]Featurized Bidirectional GAN: Adversarial Defense via Adversarially Learned Semantic Inference
- 3.00, 0.82, [2, 3, 4] [5, 4, 4]HR-TD: A Regularized TD Method to Avoid Over-Generalization
- 3.00, 0.82, [4, 3, 2] [4, 4, 1]HANDLING CONCEPT DRIFT IN WIFI-BASED INDOOR LOCALIZATION USING REPRESENTATION LEARNING
- 3.00, 0.82, [2, 3, 4] [3, 3, 4]A Rate-Distortion Theory of Adversarial Examples
- 3.00, 0.82, [2, 3, 4] [5, 5, 3]Classification in the dark using tactile exploration
- 3.00, 0.00, [3, 3, 3] [4, 5, 4]End-to-End Multi-Lingual Multi-Speaker Speech Recognition
- 3.00, 0.82, [2, 3, 4] [4, 5, 5]A Self-Supervised Method for Mapping Human Instructions to Robot Policies
- 3.00, 0.82, [2, 3, 4] [3, 4, 4]ATTENTIVE EXPLAINABILITY FOR PATIENT TEMPORAL EMBEDDING
- 3.00, 0.00, [3, 3, 3] [4, 5, 5]From Amortised to Memoised Inference: Combining Wake-Sleep and Variational-Bayes for Unsupervised Few-Shot Program Learning
- 2.75, 0.83, [4, 2, 3, 2] [5, 5, 3, 4]Predictive Local Smoothness for Stochastic Gradient Methods
- 2.67, 0.94, [4, 2, 2] [4, 5, 4]Multiple Encoder-Decoders Net for Lane Detection
- 2.67, 0.47, [2, 3, 3] [5, 2, 4]Explaining Adversarial Examples with Knowledge Representation
- 2.67, 1.25, [4, 1, 3] [5, 5, 2]Weak contraction mapping and optimization
- 2.67, 0.47, [2, 3, 3] [5, 3, 5]Exponentially Decaying Flows for Optimization in Deep Learning
- 2.67, 0.94, [2, 2, 4] [5, 4, 3]VARIATIONAL SGD: DROPOUT, GENERALIZATION AND CRITICAL POINT AT THE END OF CONVEXITY
- 2.67, 0.47, [2, 3, 3] [5, 3, 5]Faster Training by Selecting Samples Using Embeddings
- 2.67, 0.47, [3, 2, 3] [5, 5, 4]Decoupling Gating from Linearity
- 2.67, 0.47, [2, 3, 3] [5, 4, 3]End-to-End Learning of Video Compression Using Spatio-Temporal Autoencoders
- 2.67, 2.36, [6, 1, 1] [5, 5, 5]How Powerful are Graph Neural Networks?
- 2.67, 0.94, [4, 2, 2] [4, 4, 4]A bird’s eye view on coherence, and a worm’s eye view on cohesion
- 2.67, 0.47, [3, 3, 2] [5, 4, 4]HAPPIER: Hierarchical Polyphonic Music Generative RNN
- 2.67, 1.25, [3, 1, 4] [4, 4, 3]Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards
- 2.67, 0.47, [2, 3, 3] [2, 2, 3]A CASE STUDY ON OPTIMAL DEEP LEARNING MODEL FOR UAVS
- 2.50, 0.50, [2, 3] [3, 4]A Solution to China Competitive Poker Using Deep Learning
- 2.33, 0.94, [3, 1, 3] [5, 5, 5]Training Variational Auto Encoders with Discrete Latent Representations using Importance Sampling
- 2.33, 0.94, [1, 3, 3] [3, 4, 3]Psychophysical vs. learnt texture representations in novelty detection
- 2.33, 0.94, [3, 1, 3] [5, 5, 3]Pixel Chem: A Representation for Predicting Material Properties with Neural Network
- 2.33, 0.47, [3, 2, 2] [4, 5, 5]VECTORIZATION METHODS IN RECOMMENDER SYSTEM
- 2.33, 0.47, [3, 2, 2] [5, 5, 4]Deli-Fisher GAN: Stable and Efficient Image Generation With Structured Latent Generative Space
- 2.33, 1.89, [5, 1, 1] [4, 5, 5]Advanced Neuroevolution: A gradient-free algorithm to train Deep Neural Networks
- 2.33, 0.47, [2, 2, 3] [3, 4, 3]Hierarchical Deep Reinforcement Learning Agent with Counter Self-play on Competitive Games
- 2.25, 0.43, [2, 2, 3, 2] [3, 3, 3, 4]A Synaptic Neural Network and Synapse Learning
- 2.00, 0.63, [1, 2, 2, 3, 2] [5, 4, 5, 4, 5]Hierarchical Bayesian Modeling for Clustering Sparse Sequences in the Context of Group Profiling
- 1.50, 0.50, [2, 1, 2, 1] [5, 5, 5, 5]Object detection deep learning networks for Optical Character Recognition
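
For anyone who wants to reproduce the numbers in this list, here is a minimal sketch of how each entry could be computed. It is not the script actually used for this post. The helper names `summarize` and `format_entry` and the sample `papers` data are made up for illustration. The values in the list are consistent with the population standard deviation (dividing by n, not n - 1), so the sketch assumes that.

```python
from statistics import mean, pstdev


def summarize(ratings):
    """Return (average, population standard deviation), rounded to 2 decimals."""
    return round(mean(ratings), 2), round(pstdev(ratings), 2)


def format_entry(ratings, confidences, title):
    """Format one paper the same way as the entries in the list."""
    avg, std = summarize(ratings)
    return f"- {avg:.2f}, {std:.2f}, {ratings} {confidences}{title}"


# Hypothetical input: (ratings, confidences, title) tuples copied from the list.
papers = [
    ([2, 1, 2, 1], [5, 5, 5, 5],
     "Object detection deep learning networks for Optical Character Recognition"),
    ([5, 4, 3], [3, 4, 5], "Robustness and Equivariance of Neural Networks"),
    ([6, 1, 1], [5, 5, 5], "How Powerful are Graph Neural Networks?"),
]

# Sort by average Rating in decreasing order; ties keep their input order
# because Python's sort is stable.
ranked = sorted(papers, key=lambda p: mean(p[0]), reverse=True)

for ratings, confidences, title in ranked:
    print(format_entry(ratings, confidences, title))
```

Note that the Confidence scores are only printed and do not affect the ordering, which matches the simple averaging of Ratings described at the top of this post.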