Ying Zhang

Alumni

Publications

BigDocs: An Open Dataset for Training Multi-modal Models on Document and Code Tasks
Xiangru Jian
Akshay Kalkunte
Amirhossein Abaskohi
Pierre-Andre Noel
Sanket Biswas
Sara Shanian
Noah Bolger
Kurt MacDonald
Simon Fauvel
Sathwik Tejaswi
Srinivas Sunkara
Joao Monteiro
Krishnamurthy Dj Dvijotham
Torsten Scholak
Sepideh Kharagani
Sean Hughes
M. Özsu
Christopher Pal
Sai Rajeswar
Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding workflows, extracting data from documents, and summarizing reports. Code generation tasks that require long-structured outputs can also be enhanced by multimodality. Despite this, their use in commercial applications is often limited due to limited access to training data and restrictive licensing, which hinders open access. To address these limitations, we introduce BigDocs-7.5M, a high-quality, open-access dataset comprising 7.5 million multimodal documents across 30 tasks. We use an efficient data curation process to ensure our data is high-quality and license-permissive. Our process emphasizes accountability, responsibility, and transparency through filtering rules, traceable metadata, and careful content analysis. Additionally, we introduce BigDocs-Bench, a benchmark suite with 10 novel tasks where we create datasets that reflect real-world use cases involving reasoning over Graphical User Interfaces (GUI) and code generation from images. Our experiments show that training with BigDocs-Bench improves average performance up to 25.8% over closed-source GPT-4o in document reasoning and structured output tasks such as Screenshot2HTML or Image2Latex generation. Finally, human evaluations showed a preference for outputs from models trained on BigDocs over GPT-4o. This suggests that BigDocs can help both academics and the open-source community utilize and improve AI tools to enhance multimodal capabilities and document reasoning. The project is hosted at https://bigdocs.github.io .
Correction: CEPC Technical Design Report: Accelerator
Waleed Abdallah
Tiago Carlos Adorno de Freitas
Konstantin Afanaciev
Shakeel Ahmad
Ijaz Ahmed
Xiaocong Ai
Abid Aleem
Wolfgang Altmannshofer
Fabio Alves
Weiming An
Rui An
Daniele Paolo Anderle
D. Anderle
Stefan Antusch
Yasuo Arai
Andrej Arbuzov
Abdesslam Arhrib
A. Arhrib
Mustafa Ashry
Sha Bai
Yang Bai
Vipul Bairathi
Csaba Balazs
Philip Bambade
Yong Ban
Triparno Bandyopadhyay
Shou-Shan Bao
Desmond P. Barber
Ayşe Bat
Varvara Batozskaya
Subash Chandra Behera
Alexander Belyaev
Michele Bertucci
Xiao-Jun Bi
Yuanjie Bi
Tianjian Bian
Tingting Bian
Fabrizio Bianchi
Thomas Biekötter
Michela Biglietti
Shalva Bilanishvili
Deng Binglin
Lingling Men
Denis Bodrov
Anton Bogomyagkov
Serge Bondarenko
Stewart Boogert
Maarten Boonekamp
Marcello Borri
M. Borri
Angelo Bosotti
Vincent Boudry
Mohammed Boukidi
Igor Boyko
Ivanka Bozovic
Giuseppe Bozzi
Jean-Claude Brient
J. Brient
Anastasiia Budzinskaya
Masroor Bukhari
Vladimir Bytev
Giacomo Cacciapaglia
Hua Cai
Wenyong Cai
Wujun Cai
Yijian Cai
Yizhou Cai
Yuchen Cai
Haiying Cai
Huacheng Cai
Lorenzo Calibbi
Junsong Cang
Guofu Cao
Jianshe Cao
Antoine Chance
Xuejun Chang
Yue Chang
Zhe Chang
Xinyuan Chang
Wei Chao
Auttakit Chatrabhuti
Yimin Che
Yuzhi Che
Bin Chen
Danping Chen
Fuqing Chen
Fusan Chen
Gang Chen
Guoming Chen
Hua-Xing Chen
Huirun Chen
Jinhui Chen
Ji-Yuan Chen
Kai Chen
Mali Chen
Mingjun Chen
Mingshui Chen
Nanying Chen
Shanhong Chen
Shanzhen Chen
Shao-Long Chen
Shaomin Chen
Shiqiang Chen
Tianlu Chen
Wei Chen
Xiang Chen
Xiaoyu Chen
Xin Chen
Xun Chen
Xurong Chen
Ye Chen
Ying Chen
Yukai Chen
Zelin Chen
Zilin Chen
Boping Chen
Chunhui Chen
Haifeng Cheng
Huajie Cheng
Hok Chuen Cheng
Shan Cheng
Tongguang Cheng
Yunlong Chi
Pietro Chimenti
Wen Han Chiu
Guk Cho
Mingxing Chu
Ming-Chung Chu
X. Chu
Xiaotong Chu
Ziliang Chu
Guglielmo Coloretti
Andreas Crivellin
Hanhua Cui
Xiaohao Cui
Zhaoyuan Cui
B. D’Anzi
Brunella D’Anzi
Ling-Yun Dai
Xinchen Dai
Xuwen Dai
Antonio De Maria
Nicola De Filippis
Christophe De La Taille
Francesca De Mori
Chiara De Sio
Elisa Del Core
Shuangxue Deng
Wei Deng
Wei-Tian Deng
Zhi Deng
Ziyan Deng
Bhupal Dev
Tang Dewen
Biagio Di Micco
Ran Ding
Siqin Ding
Yadong Ding
Haiyi Dong
Jianing Dong
Jing Dong
Lan Dong
Mingyi Dong
Xu Dong
Yipei Dong
Yubing Dong
Milos Dordevic
Marco Drewes
Mingxuan Du
Qianqian Du
Xiaokang Du
Yanyan Du
Yong Du
Yunfei Du
Chun-Gui Duan
Zhe Duan
Yahor Dydyshka
Ulrik Egede
Walaa Elmetenawee
Yun Eo
Ka Yan Fan
Kuanjun Fan
Yunyun Fan
Bo Fang
Shuangshi Fang
Yuquan Fang
Ada Farilla
Riccardo Farinelli
Muhammad Farooq
A. F. Golfe
Almaz Fazliakhmetov
Angeles Faus Golfe
Rujun Fei
Bo Feng
Chong Feng
Junhua Feng
Xu Feng
Zhuoran Feng
Luis Roberto Flores Castillo
Etienne Forest
Andrew Fowlie
H. Fox
Harald Fox
Hai-Bing Fu
Jinyu Fu
Benjamin Fuks
Yoshihiro Funakoshi
Emidio Gabrielli
Nan Gan
Li Gang
Meisen Gao
Wenbin Gao
Wenchun Gao
Yu Gao
Yuanning Gao
Zhanxiang Gao
Yanyan Gao
Kun Ge
Shao-Feng Ge
Zhenwu Ge
Li-Sheng Geng
Qinglin Geng
Hao Zeng
Chao-Qiang Geng
Swagata Ghosh
Antonio Gioiosa
Leonid Gladilin
Ti Gong
Stefania Gori
Quanbu Gou
Sebastian Grinstein
Chenxi Gu
Gerardo Guillermo
Joao Guimaraes da Costa
Dizhou Guo
Fangyi Guo
Jiacheng Guo
Jun Guo
Lei Guo
Xia Guo
Xinyang Guo
Xin-Heng Guo
Yunqiang Guo
Yuping Guo
Yun Guo
Zhi-Hui Guo
Alejandro Gutiérrez-Rodríguez
Seungkyu Ha
Noman Habib
Jan Hajer
Francois Hammer
Chengcheng Han
Huayong Han
Jifeng Han
Liangliang Han
Liang Han
Rao Zhang
Yang Han
Ruixiong Han
Yezi Han
Yuanying Han
T. T. Han
Jiankui Hao
Xiqing Hao
Qiang Zhao
Chuanqi He
Dayong He
Dongbing He
Guangyuan He
Hong-Jian He
Jibo He
Jun He
Longyan He
Xiang He
Xiao-Gang He
Zhenqiang He
Klaus Heinemann
Sven Heinemeyer
Yuekun Heng
María A. Hernández-Ruíz
Jiamin Hong
Yuenkeung Hor
George W. S. Hou
Xiantao Hou
X. Hou
Xiaonan Hou
Zhilong Hou
Suen Hou
Caishi Hu
Chen Hu
Dake Hu
Haiming Hu
Jiagen Hu
Jun Hu
Kun Hu
Shouyang Hu
Yongcai Hu
Yu Hu
Zhen Hu
Z. Hua
Zhehao Hua
Jianfei Hua
Chao-Shang Huang
Fa Peng Huang
Guangshun Huang
Jinshu Huang
Ke Huang
Liangsheng Huang
Shuhui Huang
X. T. Huang
Xu-Guang Huang
Yanping Huang
Yonggang Huang
Yongsheng Huang
Zimiao Huang
Yuanyuan Wei
Chen Huanyuan
Changgi Huh
Jiaqi Hui
Lihua Huo
Talab Hussain
Kyuyeong Hwang
Ara Ioannisian
Munawar Iqbal
Paul Jackson
Shahriyar Jafarzade
Haeun Jang
Seoyun Jang
Daheng Ji
Q. Ji
Qingping Ji
Quan Ji
Xiaolu Ji
Jingguang Jia
Jinsheng Jia
X. Q. Jia
Xuewei Jia
Zihang Jia
Cailian Jiang
Han Ren Jiang
Houbing Jiang
Jun Jiang
Xiaowei Jiang
Xin Jiang
Xuhui Jiang
Yongcheng Jiang
Zhongjian Jiang
Cheng Jiang
Ruiqi Jiao
Dapeng Jin
Shan Jin
Song Jin
Yi Jin
Junji Jis
Sunghoon Jung
Goran Kacarevic
Eric Kajfasz
Lidia Kalinovskaya
Aleksei Kampf
Wen Kang
Xian-Wei Kang
Xiaolin Kang
Biswajit Karmakar
Zhiyong Ke
Rijeesh Keloth
Alamgir Khan
Hamzeh Khanpour
Khanchai Khosonthongkee
Bobae Kim
Dongwoon Kim
Mi Ran Kim
Minsuk Kim
Sungwon Kim
On Kim
Michael Klasen
Sanghyun Ko
S. Ko
Ivan Koop
Vitaliy Kornienko
Bryan Kortman
Gennady Kozlov
Shiqing Kuang
Mukesh Kumar
Chia Ming Kuo
Tsz Hong Kwok
François Sylvain Ren Lagarde
F. Lagarde
Pei-Zhu Lai
Imad Laktineh
Xiaofei Lan
Zuxiu Lan
Lia Lavezzi
Justin Lee
Junghyun Lee
Sehwook Lee
Ge Lei
Roy Lemmon
Yongxiang Leng
Sze Ching Leung
Hai Tao Li
Bingzhi Li
Bin Li
Changhong Li
Chao Li
Cheng Li
Chunhua Li
Cui Li
Dazhang Li
Dikai Li
Yi Wang
Gaosong Li
Haibo Li
Haifeng Li
Hai-Jun Li
Haotian Li
Hengne Li
Honglei Li
Huijing Li
Jialin Li
Jingyi Li
Jun Li
Leyi Li
Li Li
Jinmian Li
Mei Li
Meng Li
Minxian Li
Ling Li
Pei-Rong Li
Qiang Li
Shaopeng Li
Shenghe Li
Shu Li
Shuo Li
Teng Li
Tiange Li
Tong Li
Weichang Li
Weidong Li
Wenjun Li
Xiaoling Li
Xiaomei Li
Xiao-Nan Li
Xiaoping Li
Xiaoting Li
Xin Li
Xinqiang Li
Xuekang Li
Yang Li
Yanwei Li
Yiming Li
Ying Li
Ying-Ying Li
Yonggang Li
Yonglin Li
Yufeng Li
Yuhui Li
Zhan Li
Zhao Li
Zhiji Li
Lingfeng Li
Jing Liang
Jinhan Liang
Zhijun Liang
Guangrui Liao
Hean Liao
Jiaming Yan
Fei Li
Libo Liao
Longzhou Liao
Yipu Liao
Ayut Limphirat
Jiajun Liao
Tao Lin
Weiping Lin
Yufu Lin
Y. P. Liao
Yugen Lin
Beijiang Liu
Bo Liu
Danning Liu
Dong Liu
Fu-Hu Liu
Hongbang Liu
Huangcheng Liu
H. Liu
Huiling Liu
Jia Liu
Jiaming Liu
Jianbei Liu
Jianyi Liu
Jingdong Liu
Jinhua Liu
Kai Liu
Kang Liu
Kun Liu
Mengyao Liu
Pengcheng Liu
Qibin Liu
Shan Liu
Shidong Liu
Shuang Liu
Shubin Liu
Peng Liu
Tao Liu
Tong Liu
W. M. Liu
Xiang Liu
Xiaohui Liu
Xiaoyu Liu
Jian Li
Xinglin Liu
Xingquan Liu
Yang Liu
Xiao-Hai Liu
Yanlin Liu
Yao-Bei Liu
Yi Liu
Yiming Liu
Yonglu Liu
Yubin Liu
Yudong Liu
Yulong Liu
Zhaofeng Liu
Zhenchao Liu
Zhi Liu
Zhi-Feng Liu
Zhiqing Liu
Zhongfu Liu
Zuowei Liu
Mia Liu
Xiaoyang Liu
Xinchou Lou
Cai-Dian Lu
Jun-Xu Lu
Qiu Zhen Lu
Shang Lu
Wenxi Lu
Xiaohan Lu
Yunpeng Lu
Zhiyong Lu
Xianguo Lu
Wei Lu
Bayarto Lubsandorzhiev
Sultim Lubsandorzhiev
Arslan Lukanov
Jinliang Luo
T. Luo
Xiaoan Luo
Xiaofeng Luo
Xiaolan Luo
Jindong Lv
Feng Lyu
Xiao-Rui Lyu
Kun-Feng Lyu
Ande Ma
Hong-Hao Ma
Jun-Li Ma
Kai Ma
Lishuang Ma
Na Ma
Renjie Ma
Weihu Ma
Xinpeng Ma
Yanling Ma
Yan-Qing Ma
Yongsheng Ma
Zhonghui Ma
Zhongjian Ma
Yang Ma
Mousam Maity
Lining Mao
Yanmin Mao
Yaxian Mao
Aurélien Martens
Caccia Massimo Luigi Maria
Shigeki Matsumoto
Bruce Mellado
Davide Meloni
Cai Meng
Lingxin Meng
Zhenghui Mi
Yuhui Miao
Mauro Migliorati
Lei Ming
Vasiliki A. Mitsou
Laura Monaco
Arthur Moraes
Karabo Mosala
Ahmad Moursy
Lichao Mu
Zhihui Mu
Nickolai Muchnoi
Daniel Muenstermann
Pankaj Munbodh
William John Murray
Jérôme Nanni
Dmitry Nanzanov
Changshan Nie
Sergei Nikitin
Feipeng Ning
Guozhu Ning
Jia-Shu Niu
Juan-Juan Niu
Yan Niu
Edward Khomotso Nkadimeng
Kazuhito Ohmi
Katsunobu Oide
Hideki Okawa
Mohamed Ouchemhou
Qun Ouyang
Daniele Paesani
Carlo Pagani
Stathes Paganis
Collette Pakuza
Jiangyang Pan
Juntong Pan
Tong Pan
Xiang Pan
Papia Panda
Saraswati Pandey
Mila Pandurovic
Rocco Paparella
Roman Pasechnik
Emilie Passemar
Hua Pei
Xiaohua Peng
Xinye Peng
Yuemei Peng
Jialun Ping
Ronggang Ping
Souvik Priyam Adhya
Baohua Qi
Hang Qi
Huirong Qi
M. Qi
Sen Qian
Zhuoni Qian
Congfeng Qiao
Guangyou Qin
Jiajia Qin
Laishun Qin
Liqing Qin
Qin Qin
Xiaoshuai Qin
Zhonghua Qin
Guofeng Qu
Antonio Racioppi
Michael Ramsey-Musolf
Shabbar Raza
Vladimir Rekovic
Jing Ren
Jürgen Reuter
Tania Robens
Giancarlo Rossi
Manqi Ruan
Leonid Rumyantsev
Min Sang Ryu
Renat Sadykov
Minjing Sang
Juan José Sanz-Cillero
Miroslav Saur
Nishil Savla
Michael A. Schmidt
Daniele Sertore
Ron Settles
Peng Sha
Ding-Yu Shao
Ligang Shao
Hua-Sheng Shao
Xin She
Chuang Shen
Hong-Fei Shen
Jian-Ming Shen
Peixun Shen
Qiuping Shen
Zhongtao Shen
Shuqi Sheng
Haoyu Shi
Hua Shi
Qi Shi
Shusu Shi
Xiaolei Shi
Xin Shi
Yukun Shi
Zhan Shi
Ian Shipsey
Gary Shiu
Chang Shu
Zong-Guo Si
Andrei Sidorenkov
Ivan Smiljanić
Aodong Song
Huayang Song
Jiaojiao Song
Jinxing Song
Siyuan Song
Weimin Song
Weizheng Song
Zhi Song
Shashwat Sourav
Paolo Spruzzola
Feng Su
Shengsen Su
Wei Su
Shufang Su
Yanfeng Sui
Zexuan Sui
Michael Sullivan
Baiyang Sun
Guoqiang Sun
Hao Sun
Hao-Kai Sun
Junfeng Sun
Liang Sun
Mengcheng Sun
Pengfei Sun
Sichun Sun
Xianjing Sun
Xiaohu Sun
Xilei Sun
Xingyang Sun
Xin-Yuan Sun
Yanjun Sun
Yongzhao Sun
Yue Sun
Zheng Sun
Narumon Suwonjandee
Elsayed Tag Eldin
Biao Tan
Bo Tang
Chuanxiang Tang
Gao Tang
Guangyi Tang
Jingyu Tang
Liang Tang
Ying’Ao Tang
Junquan Tao
Abdel Nasser Tawfik
Geoffrey Taylor
Valery Telnov
Saike Tian
Riccardo Torre
Wladyslaw Henryk Trzaska
Dmitri Tsybychev
Yanjun Tu
Shengquan Tuo
Michael Tytgat
Ghalib Ul Islam
Nikita Ushakov
German Valencia
Jaap Velthuis
Alessandro Vicini
Trevor Vickey
Ivana Vidakovic
Henri Videau
Raymond Volkas
Dmitry Voronin
Natasa Vukasinovic
Xia Wan
Xuying Wan
X. Wang
Anqing Wang
B. Wang
Chengtao Wang
Chuanye Wang
Ci Wang
Meng Wang
Dou Wang
En Wang
Guanwen Wang
Guo-Li Wang
Haijing Wang
Haolin Wang
Jianchun Wang
JianLi Wang
Jiawei Wang
Jin Wang
Jin-Wei Wang
Joseph Wang
Kechen Wang
Lechun Wang
Wei Wang
Liguo Wang
Lijiao Wang
Lu Wang
Na Wang
Pengcheng Wang
Qi Wang
Qun Wang
Shu Lin Wang
Shudong Wang
Taofeng Wang
Tianhong Wang
Tianyang Wang
Xiaolong Wang
Xiaoning Wang
Xiao-Ping Wang
Xiongfei Wang
Xujian Wang
Yaping Wang
Yaqian Wang
Yiao Wang
Yifang Wang
Yilun Wang
Yiwei Wang
You-Kai Wang
Yuanping Wang
Yuexin Wang
Yuhao Wang
Yu-Ming Wang
Yuting Wang
Zhen Wang
Zhigang Wang
Weiping Wang
Zeren Simon Wang
Biao Wang
Hao Wang
Lian-Tao Wang
Zihui Wang
Zirui Wang
Jia Wang
Tong Wang
Daihui Wei
Shujun Wei
Wei Wei
Xiaomin Wei
Yingjie Wei
Liangjian Wen
Xuejun Wen
Yufeng Wen
Martin White
Peter Williams
Zef Wolffs
William John Womersley
Baona Wu
Bobing Wu
Guanjian Wu
Jinfei Wu
Lei Wu
Lina Wu
Linghui Wu
Minlin Wu
Peiwen Wu
Qi Wu
Qun Wu
Tianya Wu
Xiang Wu
Xiaohong Wu
Xing-Gang Wu
Xuehui Wu
Yaru Wu
Yongcheng Wu
Yuwen Wu
Zhi Wu
Xin Wu
Lei Xia
Ligang Xia
Shang Xia
Benhou Xiang
Dao Xiang
Zhiyu Xiang
Bo-Wen Xiao
Chu-Wen Xiao
Dunming Xiao
Guangyan Xiao
Han Xiao
Min Xiao
Ouzheng Xiao
Rui-Qing Xiao
Xiang Xiao
Yichen Xiao
Yu Xiao
Yunlong Xiao
Zhenjun Xiao
Hengyuan Xiao
Nian Xie
Yuehong Xie
Tianmu Xin
Ye Xing
Zhizhong Xing
Da Xu
Fang Xu
Fanrong Xu
Haisheng Xu
Haocheng Xu
Ji Xu
Miaofu Xu
Qingjin Xu
Qingnian Xu
W. Xu
Weixi Xu
Xinping Xu
Zijun Xu
Zehua Xu
Yaoyuan Xu
Feifei Xue
Baojun Yan
Bin Yan
Fen Yan
Fucheng Yan
Liang Yan
Qi-Shu Yan
Wenbiao Yan
Yupeng Yan
Luping Yan
Haoyue Yan
Dong Yang
Fengying Yang
Guicheng Yang
Haijun Yang
Jin Min Yang
Jing Yang
Lan Yang
Li Yang
Li Lin Yang
Lili Yang
Litao Yang
Mei Yang
Qiaoli Yang
Tiansen Yang
Xiaochen Yang
Yingjun Yang
Yueling Yang
Zhengyong Yang
Zhenwei Yang
Youhua Yang
Xiancong Yang
De-Liang Yao
Shi Yao
Lei Ye
Lingxi Ye
Mei Ye
Rui Ye
Yecheng Ye
Vitaly Yermolchyk
Kai Yi
Li Yi
Yang Yi
Di Yin
Peng-Fei Yin
Shenghua Yin
Ze Yin
Zhongbao Yin
Zhang Yinhong
Hwi Dong Yoo
Zhengyun You
Charles Young
Boxiang Yu
Chenghui Yu
Fusheng Yu
Jie-Sheng Yu
Jinqing Yu
Lingda Yu
Zhao-Huan Yu
Felix Yu
Bingrong Yu
Changzheng Yuan
Li Yuan
Xing-Bo Yuan
Youjin Yuan
Junhui Yue
Qian Yue
Baobiao Yue
Un Nisa Zaib
Riccardo Zanzottera
Min Zeng
Jian Zhai
Jiyuan Zhai
Xin Zhe Zhai
Xi-Jie Zhan
Ben-Wei Zhang
Bolun Zhang
Di Zhang
Guangyi Zhang
Hao Zhang
Hong-Hao Zhang
Huaqiao Zhang
Hui Zhang
Jian Wang
Jianzhong Zhang
Jiehao Zhang
Jielei Zhang
Jingru Zhang
Jinxian Zhang
Junsong Zhang
Junxing Zhang
Lei Zhang
Liang Zhang
Licheng Zhang
Liming Zhang
Linhao Zhang
Mengchao Zhang
Shulei Zhang
Wan Zhang
Wenchao Zhang
Xiangzhen Zhang
Xiaomei Zhang
Xiaoming Zhang
Xiaoxu Zhang
Xiaoyu Zhang
Xuantong Zhang
Xueyao Zhang
Yang Zhang
Yanxi Zhang
Yao Zhang
Yixiang Zhang
Yizhou Zhang
Yongchao Zhang
Yu Zhang
Yuan Zhang
Yujie Zhang
Yulei Zhang
Yumei Zhang
Yunlong Zhang
Zhandong Zhang
Zhaoru Zhang
Zhen-Hua Zhang
Zhenyu Zhang
Zhichao Zhang
Zhi-Qing Zhang
Zhuo Zhang
Zhiqing Zhang
Cong Zhang
Tianliang Zhang
Luyan Zhang
Guang Zhao
Hongyun Zhao
Jie Zhao
Jingxia Zhao
Jingyi Zhao
Ling Zhao
Luyang Zhao
Mei Zhao
Minggang Zhao
Mingrui Zhao
Ruiguang Zhao
Tongxian Zhao
Yaliang Zhao
Ying Zhao
Yue Zhao
Zhiyu Zhao
Zhuo Zhao
Alexey Zhemchugov
Hongjuan Zheng
Jinchao Zheng
Liang Zheng
Ran Zheng
Shanxi Zheng
Xu-Chang Zheng
Wang Zhile
Weicai Zhong
Yi-Ming Zhong
Chen Zhou
Daicui Zhou
Jianxin Zhou
Jing Zhou
Na Zhou
Qi-Dong Zhou
Shiyu Zhou
Shun Zhou
Sihong Zhou
Xiang Zhou
Xingyu Zhou
Yang Zhou
Yong Zhou
Yu-Feng Zhou
Zusheng Zhou
Demin Zhou
Dechong Zhu
Hongbo Zhu
Huaxing Zhu
Jingya Zhu
Kai Zhu
Pengxuan Zhu
Ruilin Zhu
Xianglei Zhu
Yingshun Zhu
Yongfeng Zhu
Xiao Zhuang
Xuai Zhuang
Mikhail Zobov
Zhanguo Zong
Cong Zou
Hongying Zou
Retrieving Signals with Deep Complex Extractors
Ousmane Dia
Mirco Ravanelli
Christopher Pal
Recent advances have made it possible to create deep complex-valued neural networks. Despite this progress, many challenging learning tasks have yet to leverage the power of complex representations. Building on recent advances, we propose a new deep complex-valued method for signal retrieval and extraction in the frequency domain. As a case study, we perform audio source separation in the Fourier domain. Our new method takes advantage of the convolution theorem which states that the Fourier transform of two convolved signals is the elementwise product of their Fourier transforms. Our novel method is based on a complex-valued version of Feature-Wise Linear Modulation (FiLM) and serves as the keystone of our proposed signal extraction method. We also introduce a new and explicit amplitude and phase-aware loss, which is scale and time invariant, taking into account the complex-valued components of the spectrogram. Using the Wall Street Journal Dataset, we compared our phase-aware loss to several others that operate both in the time and frequency domains and demonstrate the effectiveness of our proposed signal extraction method and proposed loss.
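The convolution theorem mentioned in this abstract can be checked numerically. The sketch below is not taken from the paper: it is a minimal standard-library illustration, assuming the discrete case, where the theorem holds for circular convolution. The helper names (`dft`, `circular_convolve`, `max_abs_diff`) and the sample signals are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)]

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical sample signals, chosen only for the demonstration.
x = [1.0, 2.0, 0.0, -1.0]
h = [0.5, 0.0, 0.25, 0.0]

# Left side: transform of the convolved signals.
lhs = dft(circular_convolve(x, h))
# Right side: elementwise product of the individual transforms.
rhs = [a * b for a, b in zip(dft(x), dft(h))]

# The two sides agree up to floating-point rounding.
error = max_abs_diff(lhs, rhs)
```

In practice one would use a fast Fourier transform rather than the quadratic-time `dft` above; the naive version is used here only to keep the sketch dependency-free.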
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN) made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency to process multidimensional inputs as entities, to encode internal dependencies, and to solve many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs.
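The quaternion algebra this abstract refers to rests on the Hamilton product. The following is a minimal sketch of that product only, not the paper's QCNN layer; the function name `hamilton_product` and the tuple layout `(real, i, j, k)` are assumptions made for illustration.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

# Basis quaternions.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)

ij = hamilton_product(i, j)  # equals k
ji = hamilton_product(j, i)  # equals -k: the product is not commutative
ii = hamilton_product(i, i)  # equals -1
```

Because each output component mixes all four input components, a single quaternion weight ties together the four values it multiplies, which is one way to read the abstract's point about treating grouped features as one entity.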
Fashion-Gen: The Generative Fashion Dataset and Challenge
Seyedarian Hosseini
Thomas Boquet
Wojciech Stokowiec
Christopher Pal
We introduce a new dataset of 293,008 high definition (1360 x 1360 pixels) fashion images paired with item descriptions provided by professional stylists. Each item is photographed from a variety of angles. We provide baseline results on 1) high-resolution image generation, and 2) image generation conditioned on the given text descriptions. We invite the community to improve upon these baselines. In this paper, we also outline the details of a challenge that we are launching based upon this dataset.
Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are effective models for reducing spectral variations and modeling spectral correlations in acoustic features for automatic speech recognition (ASR). Hybrid speech recognition systems incorporating CNNs with Hidden Markov Models/Gaussian Mixture Models (HMMs/GMMs) have achieved the state-of-the-art in various benchmarks. Meanwhile, Connectionist Temporal Classification (CTC) with Recurrent Neural Networks (RNNs), which is proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings. However, RNNs are computationally expensive and sometimes difficult to train. In this paper, inspired by the advantages of both CNNs and the CTC approach, we propose an end-to-end speech framework for sequence labeling, by combining hierarchical CNNs with CTC directly without recurrent connections. By evaluating the approach on the TIMIT phoneme recognition task, we show that the proposed model is not only computationally efficient, but also competitive with the existing baseline systems. Moreover, we argue that CNNs have the capability to model temporal correlations with appropriate context information.
Professor Forcing: A New Algorithm for Training Recurrent Networks
The Teacher Forcing algorithm trains recurrent networks by supplying observed sequence values as inputs during training and using the network’s own one-step-ahead predictions to do multi-step sampling. We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps. We apply Professor Forcing to language modeling, vocal synthesis on raw waveforms, handwriting generation, and image generation. Empirically we find that Professor Forcing acts as a regularizer, improving test likelihood on character level Penn Treebank and sequential MNIST. We also find that the model qualitatively improves samples, especially when sampling for a large number of time steps. This is supported by human evaluation of sample quality. Trade-offs between Professor Forcing and Scheduled Sampling are discussed. We produce T-SNEs showing that Professor Forcing successfully makes the dynamics of the network during training and sampling more similar.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-Rfou
Amjad Almahairi
Christof Angermueller
Frédéric Bastien
Justin Bayer
Anatoly Belikov
Alexander Belopolsky
Josh Bleecher Snyder
Pierre-Luc Carrier
Paul Christiano
Myriam Côté
Yann N. Dauphin
Julien Demouth
Sander Dieleman
Ziye Fan
Mathieu Germain
Matt Graham
Balázs Hidasi
Arjun Jain
Kai Jia
Mikhail Korobov
Vivek Kulkarni
Pascal Lamblin
Eric Larsen
Sean Lee
Simon Lefrancois
Jesse A. Livezey
Cory Lorenz
Jeremiah Lowin
Qianli Ma
Robert T. McGibbon
Mehdi Mirza
Alberto Orlandi
Christopher Pal
Colin Raffel
Daniel Renshaw
Matthew Rocklin
Adriana Romero
Markus Roth
Peter Sadowski
John Salvatier
Jan Schlüter
John Schulman
Gabriel Schwartz
Iulian Vlad Serban
Samira Shabanian
Sigurd Spieckermann
S. Ramana Subramanyam
Gijs van Tulder
Sebastian Urban
Dustin J. Webb
Matthew Willson
Lijun Xue
Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.