Cathy O’Neil is the author of the New York Times bestselling Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, which was also a semifinalist for the National Book Award. She is a columnist for Bloomberg View and founded the company ORCAA, an algorithmic auditing company.
She earned a Ph.D. in math from Harvard, was a postdoctoral fellow in the MIT math department, and a professor at Barnard College where she published a number of research papers in arithmetic algebraic geometry. She then switched over to the private sector, working as a quantitative analyst for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks.
Cathy wrote Doing Data Science in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014.
As chief marketing officer, Mick leads Cloudera’s worldwide marketing efforts, including advertising, brand, communications, demand, partner, solutions, and web. Mick has had a successful 25-year career in enterprise and cloud software. Prior to joining Cloudera in 2016, he served as CMO of sales acceleration and machine learning company InsideSales.com. Under Mick’s leadership, InsideSales pioneered a shift to data-driven marketing and sales that has served as a model for organizations around the globe. Prior to InsideSales, Mick served as global vice president of marketing and strategy at Citrix, where he led the company’s push into the high-growth desktop virtualization market. Before Citrix, Mick managed executive marketing at Microsoft and held numerous leadership positions at IBM Software. Mick is an advisory board member for InsideSales and a contributing author on Inc.com. He is also an accomplished public speaker who has shared his insightful messages about the business impact of technology with audiences around the world. Mick graduated from the Georgia Institute of Technology with a bachelor of science degree in management.
Nick Psaki is the Principal, Office of the CTO for Pure Storage Federal and is based in the Washington, DC area. Nick is Pure Storage's senior technical resource for Federal customers, providing deep technical knowledge of flash storage system architectures that enable business and technological transformation for government enterprises.
A 20-year veteran of the United States Army, Nick has extensive experience in designing, developing, deploying and operating information systems for data analysis, sensor integration, and large-scale server virtualization. He was the Intelligence Architectures Chief for the Army G2 (Intelligence), and the Technology and Integration Director for Army G2 Futures Directorate. He has served in multiple peacekeeping and combat operations ranging from the Balkans in the 1990s (Operation Able Sentry VI and Operation Joint Endeavor/Joint Guard) to Iraq and Afghanistan in the post-9/11 era. For the past several years, Nick has been focused on ways in which new and emerging technologies can enable more rapid and cost-efficient analysis of ever-growing bodies of data.
Nick's Thought Leadership/Public Examples and links:
Hilary is general manager, machine learning, at Cloudera. She was the founder and CEO of Fast Forward Labs, an applied machine learning research company that Cloudera acquired in 2017. She also serves as data scientist in residence at Accel Partners, a leading global venture capital firm. Previously, Hilary was chief scientist at bitly. She co-hosts DataGotham, a conference for New York's home-grown data community, and co-founded HackNY, a non-profit that helps engineering students find opportunities in New York's creative technical economy. She is on the board of the Anita Borg Institute and an advisor to several companies, including Sparkfun Electronics and Wonder. Hilary served on Mayor Bloomberg’s Technology Advisory Board and is a member of the Brooklyn hacker collective NYC Resistor.
Highly efficient, results-oriented data scientist with strong quantitative skills, development experience, and an MSc from Imperial College London (ranked in the QS World Top 10). Responsible self-starter with demonstrated experience in statistical programming languages (R, Python, SAS) and in Python for APIs. Skilled in visualization with tools such as Tableau, with a good understanding of relational databases such as SQL and Oracle and non-relational databases such as HBase, MongoDB, and Redis. Experienced with machine learning tools such as Hadoop, Spark, H2O, Sparkling Water, PySparkling, and SAS, deep learning tools such as Keras, TensorFlow, Theano, MXNet, and PyTorch, GPU CUDA programming, and scaling data science. Expert in predictive modeling, including XGBoost, regression, logit, probit, GBM, random forest, neural networks (generative models, GAN, VAE, RNN, CNN, word2vec, etc.), naive Bayes, k-nearest neighbors, and PCA, across supervised, unsupervised, semi-supervised, and reinforcement learning, as well as probabilistic modeling (PyMC3, Edward, Pyro) with methods such as MCMC, HMC, NUTS, Bayesian linear regression, and variational models. Data mining skills include parsing and natural language processing (NLP), with proficiency in language modeling techniques such as topic models, text clustering, word embeddings, Word2Vec, GloVe, text classification, RNNs, and convolutional RNNs. Familiar with development environments such as Hadoop, cloud platforms (AWS, GCP, Azure), GPU, and Spark.
Strong communication and relationship-building skills with diverse parties; fluent in English and Korean.
Product Management lead at Uber with a focus on Data Platforms and Infra. I manage Uber's Storage, Analytics, BI, and Machine Learning product lines.
Tim Spann was a Senior Solutions Architect at AirisData working with Apache Spark and Machine Learning. Previously he was a Senior Software Engineer at SecurityScorecard (http://securityscorecard.com/), helping to build a reactive platform for monitoring real-time third-party vendor security risk in Java and Scala. Before that he was a Senior Field Engineer for Pivotal focusing on CloudFoundry, HAWQ and Big Data. He is an avid blogger and the Big Data Zone Leader for DZone (https://dzone.com/users/297029/bunkertor.html).
He runs the very successful Future of Data Princeton meetup, with over 1192 members, at http://www.meetup.com/futureofdata-princeton/.
He is currently a Senior Solutions Engineer at Cloudera in the Princeton, New Jersey area.
You can find all the source and material behind his talks at his Github and Community blog:
Mehul is the founder of Infinity Services Inc, a blockchain development company focused on 'Blockchain for Business' solutions. Mehul is also the inventor of Zippy Logic, a product that implements Fuzzy Logic based AI without the need for developers.
Mehul has 20+ years of hands-on and management experience in implementing complex IT projects across global markets including North America, India, UK and Australia.
Mehul was also a top-5 winner among 1000+ participants in IBM’s blockchain competition.
David is a Director of Solution Architecture at Streamlio and a contributor to the Apache NiFi and Apache Pulsar projects. He was formerly the Practice Director at Hortonworks, where he was responsible for the development of best practices and solutions for the professional services team, with a focus on HDF-related technologies including Kafka, NiFi, and Storm. He is a co-author of “Practical Hive: A Guide to Hadoop’s Data Warehouse System” and holds a B.S. and a master’s degree in Computer Science from Kent State University.
Akshitha Ramachandran is a junior at Harvard University pursuing a joint degree in Computer Science and Statistics. She was a founding member and Lead Engineer at Harvard Student Agencies - DEV, a start-up focused on developing mobile and web applications for third-party clients. She is both a senior developer and board member at ProMazo, a campus organization partnering top students from leading universities with projects at leading companies ranging from Unilever to Whirlpool. She is on the board of the Harvard College Consulting Group, where as a Vice Director of Consulting she is responsible for the acquisition, organization and execution of eighteen client projects every semester. She directly manages 7 board members who collectively oversee more than 100 members of the organization. Additionally, Akshitha attends hackathons and has been on the board of Harvard’s Women Engineers Code (WECode) conference. This past summer she spent time at Novetta expanding their Machine Learning practices, specifically in the Named Entity Resolution space. She has contributed to the company’s internal pipeline, designed a demo for them, and published some of her work (https://www.novetta.com/2018/09/named-entity-recognition-and-graph-visualization/ and https://www.novetta.com/2018/08/evaluating-solutions-for-named-entity-recognition/).
Don Bosco Durai (Bosco) is a thought leader in enterprise security and is a committer in open source projects like Apache Ranger, Apache Ambari, and Apache HAWQ. He has also contributed towards the security for most of the Hadoop components. Bosco was the co-founder of XA Secure, which is the genesis of Apache Ranger. Bosco is currently the co-founder of Privacera, where he is tackling the data security challenges of modern data architectures such as Big Data and Cloud, in which large data sets constantly move between different environments and can lead to major security breaches or compliance violations if not managed properly. Privacera automates discovery of sensitive data, does transparent encryption/anonymization, manages access policies and monitors access.
Madhan Neethiraj is an Apache committer and PMC member for the Apache Atlas and Apache Ranger projects. He works at Hortonworks as Sr. Director of Engineering on the Enterprise Security team. His contributions include Apache Ranger features like the audit framework, stack model, tag-based policies, and masking and row-filter policies, and Apache Atlas features like V2 APIs and search enhancements. Prior to Hortonworks, Madhan was at Oracle, developing the security access management suite, governance, and real-time fraud detection/prevention products. Prior to Oracle, he was with Bharosa Inc., where he was responsible for the development of a real-time fraud detection solution for financial institutions, healthcare and eCommerce.
Dave is an Enterprise Software Architect with over twenty-five years of technical leadership in the telecommunications, financial services, and healthcare domains. Dave’s diverse experience ranges from engineering event-based and rule processing systems at “PaaS” (Platform as a Service) scale to building an autonomous-agent workplace simulation engine. At Comcast, Dave is leading the end-to-end ingest, compute, and machine learning pipeline architectures for supporting Customer Experience Big Data applications.
Jeff is a software engineer and cloud architect. He is a committer and PMC member on Apache OpenNLP. Jeff currently works on natural language processing pipeline projects and resides outside of Morgantown, WV.
At Partners & Co., Eric Wolok specializes in the sale of commercial real estate in Chicago. Partners & Co. uses open source tools such as emacs, sed, awk, Apache NiFi and Apache Spark to identify, track and facilitate unique investment opportunities for their clients.
Data science expert and software system architect with expertise in machine-learning and big-data systems. Extensive experience leading innovation projects and R&D activities to promote data science best practices within large organizations. Deep domain knowledge of various vertical use cases (e.g., Finance, Telco, Healthcare). Currently working on pushing the cutting-edge application of AI at the intersection of high-performance databases and IoT, focusing on unleashing the value of spatial-temporal data. I am also a frequent speaker at various technology conferences, including the O’Reilly Strata AI Conference, NVIDIA GPU Technology Conference, Hadoop Summit, DataWorks Summit, Amazon AWS re:Invent, Global Big Data Conference, Global AI Conference, World IoT Expo, and Intel Partner Summit, presenting keynote talks and sharing technology leadership thoughts.
Received my Ph.D. from the Department of Computer and Information Science (CIS), University of Pennsylvania, under the supervision of Professor Insup Lee (ACM Fellow, IEEE Fellow). Published and presented research papers and posters at many top-tier conferences and journals, including ACM Computing Surveys, ACSAC, CEAS, EuroSec, FGCS, HiCoNS, HSCC, IEEE Systems Journal, MASHUPS, PST, SSS, TRUST, and WiVeC. Served as a reviewer for many highly reputable international journals and conferences.
Dr. Sanjian Chen is a data science expert with deep knowledge in scalable machine learning algorithms. He has developed cutting-edge data-driven modeling techniques and autonomous systems in both academic and industry settings. He designed data-analytics solutions that drove numerous high-impact business decisions for multiple Fortune 500 companies across several industries, including retail, banking, automotive, and telecommunications. He is currently working on building cutting-edge cloud-based AI engines for high-performance distributed database systems that support scalable data analytics in multiple business areas. Dr. Chen is a frequent invited speaker at top international conferences, including the Strata Data Conference (San Francisco, London), the IEEE Cyber-Physical Systems Week (Chicago), the IFAC conference on Analysis and Design of Hybrid Systems (Atlanta), and IEEE International Conference on Healthcare Informatics (Philadelphia, Dallas).
Dr. Chen received his Ph.D. in Computer and Information Science at the University of Pennsylvania. He received two IEEE Best Paper Awards (IEEE RTSS 2012 and IEEE ISORC 2018). He has published over 25 papers in top journals and conferences, including 2 articles published in the Proceedings of IEEE (IF=9.1). He has served as an invited reviewer for numerous top international journals and conferences, e.g., the IEEE Design & Test, IEEE Transactions on Computers, ACM Transactions on Cyber-Physical Systems, IEEE Transactions on Industrial Electronics, IEEE RTSS conferences, and ACM HSCC conference.
Solutions Engineer with Hortonworks for almost 5 years serving the Federal sector with a heavy emphasis on DoD customers.
Sridhar is an Enterprise Architect delivering high-impact IT solutions with cross-functional execution. He has many years of applications programming experience in diverse industries including Retail, Healthcare, Manufacturing, Utilities and Telco. His experience includes building and managing operations for multi-tenant Hadoop clusters of more than 500 nodes and growing, where he focuses on optimized and stable clusters, proactive maintenance and efficient operations.
I am currently working for Hortonworks as a Senior Software Engineer focused on data management products. I actively contribute to the Hortonworks DataPlane Services platform and Hortonworks Data Lifecycle Manager. Prior to Hortonworks, I worked at Informatica on the Intelligent Data Warehouse and big data platform using Hadoop, Hive, and Teradata connectors. Prior to Informatica, I worked at Teradata on data movement products such as Teradata Parallel Transporter and Teradata Connector for Hadoop.
Niru Anisetti is the product manager for Data Lifecycle Manager at Hortonworks. She is part of a passionate team building the next-generation disaster recovery product to make millions of data managers’ lives easier. Before Hortonworks, she worked at IBM, Intuit and Yahoo, among other companies, building products that not only generate revenue but also change people’s lives for the better. She can be reached at firstname.lastname@example.org.
Lokesh Jain is a software engineer at Hortonworks. He completed a B.E. (Hons.) in Computer Science and an M.Sc. (Hons.) in Mathematics at BITS Pilani. He is one of the early developers of the Apache Ratis project and also contributes to Apache Hadoop. He also worked on a GSoC project for the SageMath organisation in 2017.
Experienced software development professional with strong exposure to various big data technologies. Skilled in Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, Sqoop), Cassandra, Spark, Core Java, Scala, relational databases and data warehousing, with good skills in various SDLC methodologies.
Experienced software development professional with strong exposure across big data technologies including HDF, Spark, HBase, and Phoenix, and expertise in building end-to-end data warehousing/BI solutions.
Kamil is a technology leader in the large scale data warehousing and analytics space. He is CTO of Starburst, the enterprise Presto company. Prior to co-founding Starburst, Kamil was the Chief Architect at the Teradata Center for Hadoop in Boston, focusing on the open source SQL engine Presto. Previously, he was the co-founder and chief software architect of Hadapt, the first SQL-on-Hadoop company, acquired by Teradata in 2014.
Kamil began his journey with Hadoop and modern MPP SQL architectures about 10 years ago during a doctoral program at Yale University where he co-invented HadoopDB, the original foundation of Hadapt’s technology.
Kamil holds an M.S. in Computer Science from Wroclaw University of Technology as well as an M.S. and an M.Phil. in Computer Science from Yale University.
Dipti Shankar is a Ph.D. candidate in the Department of Computer Science and Engineering at The Ohio State University. She is currently a Graduate Research Associate at the Network-Based Computing Lab (NOWLAB), working under Dr. Dhabaleswar K. (DK) Panda and Dr. Xiaoyi Lu. Her research interests include high-performance networking and storage media for Big Data middleware, including Remote Direct Memory Access (RDMA)-aware designs, non-volatile memory technologies, and memory-centric storage systems. At NOWLAB, she has been assisting with the research and development of RDMA-based accelerations for Apache Spark, Apache Hadoop, and Memcached, which are publicly available at http://hibd.cse.ohio-state.edu. More details about Dipti are available at http://web.cse.ohio-state.edu/~shankar.50/.
Carolyn Duby is a Solutions Engineer and Cyber Security SME at Hortonworks, where she helps customers harness the power of their data with Apache open source platforms. Previously, she was the architect for cybersecurity event correlation at SecureWorks. A subject-matter expert in cybersecurity and data science, Carolyn is an active leader in the community and frequent speaker at Future of Data meetups in Boston, MA, and Providence, RI, and at conferences such as Strata Data Conference, DataWorks Summit, Open Data Science Conference and Global Data Science Conference. Carolyn holds an ScB (magna cum laude) and ScM from Brown University, both in computer science. She is a lifelong learner and recently completed the Johns Hopkins University Coursera Data Science Specialization.
Terry Padgett is an accomplished Hadoop Systems Architect, with over 8 years of hands-on installation, integration and development with Hadoop technologies. Terry also has extensive experience in the development and application of advanced information technologies, providing software project leadership, software architecture development and assisting the customer in the application of technologies to provide capabilities and solve pressing problems. A seasoned technical lead and software developer, Terry is experienced with multiple programming languages, among them Java and C, with application throughout the entire software development lifecycle.
I am currently an Engineering Manager at Uber, where I am a member of the Hadoop Platform team working on large-scale data ingestion and dispersal pipelines and libraries leveraging Apache Spark. I was previously the tech lead on the metrics team at Uber Maps, building data pipelines to produce metrics that help analyze the quality of our mapping data. Before joining Uber, I worked at Twitter as an original member of the Core Storage team building Manhattan, a key/value store powering Twitter's use cases. I love learning anything about storage and data platforms and distributed systems at scale.
Dr. Alex Xiaoyang Yang is the CTO and Chief Architect of IBM China Development Laboratory.
He has extensive experience with big data analytics in FSS, Transportation, and Telecom.
Surekha Saharan is a Druid committer and Software Engineer at Imply. Previously, she worked at a cloud startup and at Cisco Systems, where she prototyped, architected and implemented large-scale systems. She holds an MS in Computer Science from the University of Southern California and a BS in Computer Engineering from the National Institute of Technology, India.
Benjamin Hopp has been involved in architecting big data and streaming data solutions for companies of all sizes. Currently, he is a Solutions Architect with Imply where he assists organizations to deploy and manage Apache Druid solutions. Previously, he worked as a Senior Systems Architect with Hortonworks specializing in streaming data use-cases using HDF and Apache NiFi.
Tijo is an accomplished Hadoop expert with over 6 years of hands-on development experience with Hadoop and streaming technologies and over 15 years of software industry experience. He has primarily worked on Hadoop development related to scalability, performance, load balancing, failover, and fault tolerance improvements and solutions. He has 5 years of experience in batch processing (HDFS, YARN, Spark, and Hive) and over 3 years of experience with Apache NiFi, Ranger, and Atlas. He has good exposure to architecture and design for big data solutions and POCs, and to cluster operation and management systems for large-scale Hadoop clusters.
Barbara Eckman is a Senior Principal Software Architect at Comcast and a recognized innovator in Big Data architecture and governance. She leads data discovery and lineage platform architecture for a division-wide initiative comprising streaming, transforming, storing, and analyzing Big Data. Barbara is also the Lead Metadata Architect for the Comcast Privacy Program, an initiative tackling the challenge of legislation like the California Consumer Privacy Act. Her prior experience includes scientific data and model integration at the Human Genome Project, Merck, GlaxoSmithKline, and IBM, where she served on the peer-elected IBM Academy of Technology.
Dr. Zhong Wang is a career computational biologist and group leader for genome analysis at the DOE Joint Genome Institute (JGI); he is also an adjunct professor at the University of California at Merced. He received his Ph.D. in Cell Biology from Duke University in 2004. He did his postdoc in the Institute of Genome Science and Policy at Duke University before becoming a research scientist at Yale University in 2008. He joined the DOE Joint Genome Institute in 2009 and established his independent research in transcriptomics, metagenomics, and big data analytics. Dr. Wang has published over 30 high-quality papers, including several in Science and Nature. More information about his research can be found at http://jgi.doe.gov/our-science/scientists-jgi/genome-analysis/
Accomplished product owner with multiple years of professional experience working with leading-edge technologies supporting a large landscape of business cases. Proven problem-solving skills, best management practices, and results-oriented decision-making capabilities.
I am a data enthusiast currently working at Hortonworks as a Solutions Engineer in the San Francisco Bay Area. I have an immense fascination with, and an equal passion for, anything related to data and cloud.
As a former retail and consumer goods executive and more recently as a business strategy consultant and solution provider, Brent has extensive experience working with a variety of retail and consumer goods companies to provide thought leadership and help them to align strategic business objectives with technology and analytic solutions to create a differentiated competitive advantage in the marketplace.
He has an extensive track record of imagining, designing and executing high impact business solutions, driving innovation and transformation for retail and consumer goods organizations. Brent is passionate about analytics, emerging technologies, consumer behavior, collaborative supply chains and retail transformation.
As General Manager of Retail and Consumer Goods Solutions at Hortonworks, Brent is responsible for driving the solution vision and go-to-market strategies with each segment. As industry leaders increasingly invest in Big Data Analytics to help drive transformation within their organizations, Brent engages globally to share, discuss, provide keynote talks, and facilitate workshops to help define and create solutions to drive next-generation insights and positive business outcomes across the value chain.
I am a data scientist with Miner & Kasch, a data science consulting firm. I specialize in developing automated solutions for our clients using machine learning, specifically in the domains of computer vision and natural language processing. Additionally, I lead the deep learning training sessions that Miner & Kasch holds.
Across a variety of domains I have successfully applied deep learning to computer vision problems involving image classification, object detection and segmentation. For Natural Language Processing tasks I have created neural information retrieval systems, semantic similarity search engines, and question answering systems. My favorite machine learning techniques are representation learning methods that result in surprising and useful latent variables that facilitate higher level tasks.
Yanbo is a staff software engineer at Hortonworks. He works at the intersection of systems and algorithms for machine learning and deep learning. He is an Apache Spark PMC member and contributes to several open source projects such as TensorFlow, Keras and XGBoost. He delivered the implementation of some major Spark MLlib algorithms. Prior to Hortonworks, he was a software engineer at Yahoo! and France Telecom working on machine learning and distributed systems.
Nitin Khandelwal is a Staff Engineer at Qubole. He has worked on a range of projects, such as adding encrypted communication for ephemeral cluster nodes running in the cloud, providing Hive as a multi-tenant service, and autoscaling. He has contributed significantly to optimizing the Tez engine for ETL workloads by adding features like workload-aware autoscaling, fault tolerance, and effective use of spot nodes.
Previously, Nitin worked at Microsoft on the VPN site-to-site gateway service, which forms the backbone of Microsoft Azure Stack's network.
Nitin completed his master's in Computer Science at IIIT-Hyderabad. His main areas of focus there were distributed computing, databases and networks.
Shreya Bhatia is a Member of Technical Staff at Qubole. She works on the Hive stack and has been part of projects such as providing Hive as a service on a cloud-agnostic platform, building a metrics and alerting solution for HiveServer2 and stabilizing it under highly concurrent load, and performance analysis of MapReduce on YARN in the Qubole stack.
She completed her master's in Computer Science at Stony Brook University, New York, in 2016. Previously she worked in India with InfoEdge (Naukri.com) as part of the Search Team, building extraction systems such as a resume/email parser and a job crawler.
Ian Brooks holds a Ph.D. in Computer Science from University of North Texas, and his dissertation focused on virtual teams, leadership, and predictive analytics. He is committed to improving his craft, and he has a great passion for science, data, and computing. Currently, Ian is a member of the Public Sector team at Hortonworks, and he recently relocated to Washington DC. When he isn't stressing over the details, Ian enjoys mountain biking, kettlebells, and beer making.
Pradeep is a Senior Big Data Engineer at Hotels.com in London where he builds and manages cloud infrastructure and core services like Apiary. Pradeep has worked in the big data space for the last 7 years, building large scale platforms.
Elliot is a principal engineer at Hotels.com in London where he designs tooling and platforms in the big data space. Prior to this Elliot worked in Last.fm’s data team, developing services for managing large volumes of music metadata.
Kai Liu is a Senior Program Manager in the AI and Research group at Microsoft. He has 8 years of experience in data-driven engineering, big data platforms and AI infrastructure for the Office and Bing product families. He led his team to create a service health portal for SharePoint Online, inject a distributed log collection and storage system for Exchange Online, publish curated data sets and key business metrics, and enable sub-hour experimentation in Office 365.
Currently he is working on the next-generation Big Data and Deep Learning platform for Bing based on open source technologies.
Sanjeev Koranga leads PayPal’s instrumentation and analytics platform team. He is responsible for making PayPal’s behavioral analytics self-serve and for designing and developing systems that turn data into meaningful insights.
Shobana is a Big Data enthusiast, passionate about data analytics and solving complex and interesting problems.
She leads the Tracking Platform team at PayPal, which is responsible for collecting all of PayPal's server-side behavioral events, which are then enriched and processed for various downstream analytics.
She has experience building real-time analytics platforms that collect and process more than 15 billion events a day using technology stacks that include software such as Squbs, Kafka, Hadoop, Spark, Presto, Druid and Teradata.
Sunil Govindan has been contributing to the Apache Hadoop project since 2013 in various roles: Hadoop contributor, Hadoop committer, and member of the Project Management Committee (PMC). He is a Staff Software Engineer at Hortonworks on the YARN team. His main contributions are YARN scheduling improvements such as intra-queue resource preemption, support for multiple resource types in YARN with resource profiles, and absolute resource configuration support in queues. He also drove efforts with the community to improve the YARN UI for a better user experience. Before Hortonworks, he worked at Juniper on a custom resource scheduler. Prior to that, he was with Huawei, where he worked on platform and middleware distributed systems, including the Hadoop platform. He loves reading books, is an ardent music lover, and is passionate about go-green efforts.
Weiwei is a Software Engineer at Cloudera and an Apache Hadoop committer and PMC member. He has been working on Hadoop for over 8 years and has contributed to both HDFS and YARN. His work mainly includes storage features in Ozone and scheduling features in YARN such as placement constraints, async scheduling, and CSI adoption. He is now focused on adding scheduling features in Kubernetes in order to support both batch and service workloads. Before Cloudera, he worked on Alibaba’s data infrastructure team, where he gained experience evolving a big data platform at a scale of 10k+ nodes. Prior to that, he worked at IBM for several years as one of the founding members of the BigInsights project.
Leader of the HDFS/ZooKeeper project at Xiaomi, focusing on distributed filesystems. Six years of experience with large-scale distributed storage systems.
Owen O'Malley is a co-founder and technical fellow at Hortonworks, a rapidly growing company (25 to 1,000 employees in 5 years), which develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for their data analytics. Owen has been working on Hadoop since the beginning of 2006 at Yahoo, was the first committer added to the project, and used Hadoop to set the Gray sort benchmark in 2008 and 2009. In the last 8 years, he has been the architect of MapReduce, Security, and now Hive. Recently he has been driving the development of the ORC file format and adding ACID transactions to Hive. Before working on Hadoop, he worked on Yahoo Search's WebMap project, which was the original motivation for Yahoo to work on Hadoop. Prior to Yahoo, he wandered between testing (UCI), static analysis (Reasoning), configuration management (Sun), and software model checking (NASA). He received his PhD in Software Engineering from University of California, Irvine.
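Srikanth Venkat currently works at Hortonworks on the security and governance product portfolio, which includes Apache Knox, Apache Ranger, Apache Atlas, platform-wide security, and Hortonworks DataPlane Service. Before joining Hortonworks, he held a variety of roles in areas such as cloud services, marketplaces, security, and business applications. Srikanth has leadership experience ranging from product management to strategy and operations to technical architecture, and has worked at companies from startups to global enterprises, including Telefonica, Salesforce, Cisco-Webex, Proofpoint, Dataguise, Trilogy Software, and Hewlett-Packard. Srikanth holds a PhD in engineering with a focus on artificial intelligence from the University of Pittsburgh, an MBA in General Management from Indiana University, and a master's degree in global management from Thunderbird School of Global Management. His hobbies are data science and machine learning, and he enjoys tinkering with big data technologies.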
Naveen is an Engineering Manager with 7+ years of data engineering, data science, and analytics experience across the retail, finance, and marketing industries. In his current role at WalmartLabs, Naveen leads Walmart's Customer Experience and Store Marketing engineering team. One of his key initiatives in recent months has been to bring Walmart's various data assets under a single data lake platform. He has worked with a range of database technologies throughout his career and has worked extensively on the entire Hadoop stack at WalmartLabs. Over the past few years, he has led several teams building end-to-end data and visualization platforms, and has evaluated and implemented multiple query acceleration and SQL-on-Hadoop layers, such as Druid, LLAP, Spark, Kinetica, and SAP HANA, to power Walmart's BI platforms.
Speaking & Presentation Experience:
@NWA IISE conference: Topic: 'Data Cafe: Enabling Real Time Insights Through Visualization'
@Bentonville Data Science Meetup: 'Data Cafe: Ask Me Anything - Bot Framework using NLP'
Naveen's work at WalmartLabs was featured in Forbes as one of "The Most Practical Big Data Use Cases Of 2016": https://www.forbes.com/sites/bernardmarr/2016/08/25/the-most-practical-big-data-use-cases-of-2016/#1a1206531625
I am Abhishek Gupta, a Software Engineer 2 at Walmart Labs with around 4 years of professional experience in the IT industry. At Walmart, I work on the Data Lake Initiative, applying the principles of several pillars of data solutions: data architecture, data engineering, metadata management, and data governance. On the tools and technologies side, I am an active user of Hadoop, Hive, Spark, and Spring Boot, among others.
Prior to this, I worked for more than 2 years in data warehousing and business intelligence at AIG (American International Group). I pursued my Master's in Management Information Systems, with a specialization in Data Analytics, at the University of Arizona.
I enjoy public speaking and am beginning to share knowledge through talks and sessions. Recently, I had the chance to be a host at the "Open Data Science Conference West 2018".
Senior Technologist at American Water working on HDP & HDF.
Experienced Technologist with a demonstrated history of working in the utility space.
Expert in Hadoop, Hive, Spark, and NiFi.
Notable expert clinical information systems specialist offering 25-plus years of strategic leadership. Successful architect of healthcare data warehouses, clinical and business intelligence tools, big data ecosystems, and a health information exchange.
Excel at leading the development of long-term systems strategy for major medical organizations and executing plans to select innovative technology, implement systems, and leverage and maximize system functionality to enhance the health care delivery process.
Evangelist for the use of clinical technology to drive daily operations, analysis, and decisioning. Reputation for building consensus among medical, nursing and research leadership, clinical departments, and IT.
Scope of expertise encompasses designing technology-enabled processes, leading the clinical model for transformation, coordinating clinical workflow, ensuring the use of standardized data elements in clinical systems to meet clinical and research data warehouse requirements, leading teams through development and launch, and directing training.
Charles Boicey MS, RN-BC is the chief innovation officer for Clearsense, an outcomes-driven healthcare technology company based in Jacksonville, FL. Previously, Charles was the enterprise analytics architect for Stony Brook Medicine, where he developed the analytics infrastructure to serve the clinical, operational, quality, and research needs of the organization. He was a founding member of the team that developed the Health and Human Services award-winning application NowTrending to assist in the early detection of disease outbreaks by utilizing social media feeds. Charles is a former president of the American Nursing Informatics Association.
Mayank Kejriwal is a research scientist at the University of Southern California's Information Sciences Institute (ISI), and a research assistant professor in the Department of Industrial and Systems Engineering. He received his Ph.D. from the University of Texas at Austin. His dissertation involved Web-scale data linking, and in addition to being published as a book, was recently recognized with an international Best Dissertation award in his field. His research is highly applied and sits at the intersection of knowledge graphs, social networks, Web semantics, network science, data integration, and AI for social good. He has contributed to systems used by both DARPA and law enforcement, and has active collaborations across academia and industry. He is currently co-authoring a textbook on knowledge graphs (MIT Press, 2018), and has delivered tutorials and demonstrations at numerous conferences and venues, including top academic venues such as KDD, AAAI, and ISWC, as well as industrial venues. He is currently serving as general chair of the ACM K-CAP conference in 2019, and is co-editing a special issue on knowledge graphs in the Semantic Web Journal. He was awarded a Key Scientific Challenges award in 2018 by the Allen Institute for Artificial Intelligence, and was recently named a Forbes Under 30 Scholar. He has also been nominated as a 2019 Forbes 30 Under 30 in the Science category.
Sridhar is a technology leader currently responsible for building a Finance data lake at Walmart. He is Sr Manager II Engineering, Global Data Analytics Platform, Walmart. Before working in the data analytics area, Sridhar led multiple HR implementations at Walmart. Previously, he worked at Deloitte Consulting, Hyderabad.
Sridhar has 15+ years of IT experience in Retail, Healthcare and Finance domains.
@NWA Arkansas IISE Chapter Conference on Data Cafe: Enabling Real-Time Insights Through Visualization
Pardeep is a Senior Solutions Architect at Cloudera. He has worked in the Big Data space for 9 years in a broad variety of roles.
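Michael Ger has 25 years of experience as an industry and information technology strategist. He has deep cross-industry knowledge of business processes related to product development, manufacturing, supply chain, and customer experience. As head of the manufacturing and automotive division at Hortonworks, Mike helps drive the solution vision and go-to-market strategy for each industry, and partners with industry leaders to advance next-generation business insights through big data analytics. Before joining Hortonworks, Mike worked for more than 20 years as the leader of Oracle's automotive industry division, as an automotive management consultant at A.T. Kearney, and as a production engineer at General Motors (Saturn division).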
As VP of Industry Solutions, Cindy Maike is responsible for global industry solutions and customer engagement for Cloudera. She works with customers and partners leveraging analytics for current day business growth and exploring the use of new data sources to drive innovation in the evolving world of insurance. She has over 25 years of finance, consulting and advisory services experience in the insurance industry working with clients globally on their business strategy leveraging analytics and technology to further drive business results.
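Cindy has deep industry knowledge of both insurance claims and underwriting, and focuses on using analytics and data to improve business outcomes. Her experience includes roles with the IBM Watson Solutions Group, Carrier Insurance, and as head of strategy at ACORD, and she is a co-founder of Strategy Meets Action Research and Advisory Services. She is also a certified public accountant.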
Sanjay is a telecom industry veteran with extensive experience in the strategy and execution of next generation data-centric industry solutions for enhancing customer experience, optimizing network operations and increasing revenue generation through digital transformation.
Sanjay currently leads the global communications and media business at Hortonworks, helping communication service providers leverage Hadoop and NiFi to transform their data into a force for business growth and competitive differentiation, and to drive data-centric solutions for the connected world and for industrial IoT. Previously, he held executive roles leading the global telecom industry business, solutions, and strategy at VMware, Pivotal, Progress Software, Savvion, and TMNG, and has helped drive business transformation, end-to-end architecture, and new business initiatives at Bell Canada, Level3, AT&T Canada, Iowa Telecom, ETB, ATT/Ameritech, Wingcast, and other global service providers.
Paul Gibeault has a B.S. degree in computer science from the University of Idaho and has spent most of his career developing software frameworks and tools to enable the automation of semiconductor equipment. He is currently a Big Data Solution Architect with the IT group at the third largest memory manufacturer in the world, Micron Technology, headquartered in Boise, Idaho. For the past five years he has focused on the design and implementation of an automated approach to create Micron’s Global Data Warehouse.
Focused on architecting ingestion feeds through Apache NiFi and loading data into Hadoop and Teradata.
In my spare time I like to network my house and build fun things with IoT devices.
Lohit is part of the Hadoop and Log Management team at Twitter. He has been concentrating on scaling the Hadoop FileSystem, the Hadoop Resource Manager, and log ingestion and processing pipelines at Twitter. Previously, he worked at a few startups building scalable file systems and was also part of the Hadoop team at Yahoo! when Hadoop was open sourced. He has a Master's degree in Computer Science from Stony Brook University.
Vrushali Channapattan is an active Apache Hadoop Committer & PMC member who is currently working in the Hadoop team at Twitter focusing on ensuring that Hadoop can keep meeting the rapidly expanding storage and computation needs at Twitter. In past roles, she has also worked with Intuit, Yahoo!, Oracle, Persistent Systems and Tata Institute of Fundamental Research in India.
Love to learn. Learn to succeed.
Henry Sowell is Hortonworks Technical Director in the Public Sector.
In this capacity, Mr. Sowell leads an engineering group responsible for the technical architecture and engineering of Big Data solutions supporting missions across the Intelligence Community, Department of Defense, Federal Civilian Government Agencies, and State, Local, and Higher Education institutions, helping improve speed to mission.
Prior to joining Hortonworks, Mr. Sowell used several technologies, including Apache Hadoop, to protect the nation in support of the FBI’s counterterrorism mission. In addition to supporting the counterterrorism mission, he leveraged these technologies to support cross-division law enforcement advancements with the FBI’s Cyber Division. Mr. Sowell enlisted in the United States Marine Corps in 2003. He served with distinction as a decorated combat veteran, having earned the Bronze Star with Valor for his actions in Iraq.
Leo Garciga serves as the Joint Improvised-Threat Defeat Directorate (JD) Chief Technology Officer under the Defense Threat Reduction Agency (DTRA). In his role, he provides leadership and oversight of Mission Information Technology services and personnel that directly contribute to the implementation of the DTRA mission and its support to the warfighter, the Department of Defense (DoD), Combatant Commanders, Coalition partners, and the Intelligence and Interagency organizations.
Mr. Garciga is also DTRA JD's senior information technology advisor, who discovers and rapidly implements new technology and innovation to counter threat networks, improvised threats, and improvised explosive devices, to support counter-terrorism and counter-insurgency operations, and to prevent battlefield surprise.
He advocates and spearheads efforts across DoD, the Intelligence Community, US Government agencies, academia, and industry to integrate a wide range of research and development work and rapidly introduce new information technology that provides immediate operational impact for the warfighter and the nation. His efforts have resulted in continuous enhancements to Catapult, a rapid-response data analytics platform, improving situational awareness for thousands of users. He made JIDO (JD) an early adopter and leader within DoD in implementing Secure DevOps, which unified security, software development, and operations to automate processes for innovation in information technology. He is also a key contributor to DoD's understanding of the potential of artificial intelligence and machine learning for future missions.
Mr. Garciga has a BA in Mechanical Engineering Technology and is a certified Information Technology professional. He has also served in a variety of roles in DoD, including active duty service in the US Navy, the Combatant Commands, and the Intelligence Community.
Suresh Yadagotti Jayaram is the Senior IT Application Architect for Florida Blue, Florida’s Blue Cross and Blue Shield company, which is the largest health insurance provider in the state. His extensive experience includes software architecture and engineering leadership roles at multiple global firms including HP, PayPal, Tata, and Deloitte. Suresh is passionate about business intelligence and implementing business architecture to reflect strategies that support elite IT departments, regardless of industry. He holds a master’s degree in Innovations and Entrepreneurship from HEC, Paris.
Praveen Kanumarlapudi is a Lead Data Engineer with the Global Security team at Aetna (a CVS Health company). Prior to his time at Aetna, Praveen worked on big data solutions for Apple and Bank of America.
John has a degree in Physics from Rutgers University, where he did research in condensed matter physics. He went on to become the Supervisor of Testing and Assembly for Leonardo Helicopters (then AgustaWestland) Philadelphia. More recently, John held various roles at American Water, where he worked on emerging technology and then transitioned to data engineering, where he was in charge of the HDF and HDP platform. Before joining Cloudera, John transitioned to the Autonomous Intelligence team, where he was in charge of integrating the platforms to allow data scientists to work with various types of data. Currently, John works as a Solutions Engineer for Cloudera supporting the Mid-Atlantic region.
Tristan Zajonc is CTO for Machine Learning at Cloudera. Tristan previously led engineering for Cloudera Data Science Workbench and was the cofounder and CEO of Sense, an enterprise data science platform that was acquired by Cloudera in 2016. He has over 15 years of experience in applied data science, machine learning, and machine learning systems development across academia and industry, and holds a PhD from Harvard University.
Nisha Muktewar is a Research Engineer at Cloudera Fast Forward Labs, where she researches the latest ideas in machine learning, builds prototypes that showcase these capabilities when applied to real-world use cases, and advises clients in this space. Prior to joining Cloudera, she worked as a Manager in Deloitte’s Actuarial, Advanced Analytics & Modeling practice, leading teams in designing, building, and implementing predictive modeling solutions for pricing, consumer behavior, marketing mix, and customer segmentation use cases for insurance and retail/consumer businesses. She holds a Bachelor of Engineering degree in computer science from the University of Pune, India.
Sagar Kewalramani is a Strategic Solution Architect and Data Scientist at Cloudera, where he helps customers install, build, secure, optimize, and tune their Hadoop clusters. He also helps new customers transition to the Hadoop platform and implement their initial use cases. Sagar has worked with customers from all verticals, including banking, manufacturing, healthcare, and retail. He has wide experience in building business use cases; high-volume real-time data ingestion, transformation, and movement; and data lineage and discovery. He has led the discovery and development of big data and machine learning applications to accelerate digital business and simplify data management and analytics. He has spoken at multiple Hadoop and big data conferences, including O'Reilly Strata. Previously, he was a Data Architect at Meijer Inc., where he focused primarily on architecture design and administration roles for ETL tools and databases, including Teradata.
Justin leads Cloudera's Fast Forward Labs team. Justin is a career data professional and Data Science leader with experience in multiple industries and companies. Previously, Justin was the head of Applied Machine Learning at Fitbit, the head of Cisco’s Enterprise Data Science Office and a Big Data Systems Engineer with Booz Allen Hamilton after serving as a Marine Corps Officer, with a focus in Systems Analytics and Device Intelligence. Justin is a graduate of the US Naval Academy with a degree in Computer Science and the University of Southern California with a Master’s Degree in Business Administration and Business Analytics.
Alice Albrecht leads our strategic engagements and advising at Cloudera Fast Forward Labs. She is passionate about helping organizations see a return on their investment in data and helping them build the future. Previously she was a research engineer at Cloudera Fast Forward Labs where she spent her days researching the latest and greatest in machine learning and artificial intelligence and bringing that knowledge to working prototypes and delivering concrete advice for clients. Prior to joining Cloudera, Alice worked in both finance and technology companies as a practicing data scientist, data science leader, and a data product manager. In addition to helping organizations harness the power of machine learning, Alice is passionate about mentoring and helping others grow in their careers. Alice holds a PhD from Yale in cognitive neuroscience where she studied how humans summarize sensory information from the world around them.
Varun Gupta is a technology executive with extensive IT design, strategy, and leadership experience in delivering business solutions that improve performance, reduce operational expenses, and promote organizational growth. His focus areas include information and data architecture and strategy, machine learning, business intelligence, digital solutions, cloud frameworks, and database development and implementation. He is proficient in guiding organizations through analytics and digital transformations and in setting up big data ecosystems and advanced analytics platforms for desired outcomes. He has worked with Fortune 500 companies and has experience leading and mentoring teams of designers and architects throughout the complete software development life cycle.
He is experienced in partnering with data, analytics, and digital technology vendors and technology teams to evaluate innovative solutions and to build data infrastructure and solutions in phases aligned with customer and business needs.
He is an expert in diverse healthcare IT solutions, having worked for both provider and payer organizations. He is proficient in leading large teams comprising program managers, technical architects and leads, and database engineers and designers.
He is experienced in designing state-of-the-art data-as-a-service frameworks via self-service APIs, big data on the Hadoop ecosystem, citizen business intelligence solutions, and the Salesforce platform.
Varun has a bachelor's degree in Computer Engineering and a master's degree in Data Science and Program Management from UConn.
Ferd Scheepers is the Chief Information Architect of ING. Ferd has been driving ING’s journey to becoming a data-driven company for the last 4 years. He has published on data lakes and is a frequent speaker at both major vendor conferences and open source summits. Currently he is championing the Apache Atlas open metadata initiative. Passionate about data, both its opportunities and its risks, Ferd loves to share his vision and ideas on what data will mean for both companies and individuals.
Robert is an AI evangelist at Cloudera and has over 12 years of experience working on various projects related to Artificial Intelligence, Robotics, IoT, Enterprise & Embedded Software. His primary focus at Cloudera is building communities around IoT, Big Data and Data Science, and enabling Enterprises to accelerate adoption of cutting edge open-source technologies (from Edge to AI).