Java Hadoop - Search News

A library for manipulating bioinformatics sequencing formats in Apache Spark.

This code grew out of, and was heavily inspired by, Hadoop-BAM and Spark-BAM. Spark-BAM has shown that reading BAMs for Spark can be both more correct and more performant than the Hadoop-BAM ...

GitHub

onefoursix/Cloudera-Impala-JDBC-Example

Apache Impala (Incubating) is an open source, analytic MPP database for Apache Hadoop. This example shows how to build and run a Maven-based project to execute SQL queries on Impala using JDBC. This ...

IEEE

Analysis and performance improvement of K-means clustering in big data environment

Abstract: The big data environment is used to support the processing of huge amounts of data. In this environment, tons (i.e., gigabytes, terabytes) of data are processed. Therefore the various online ...

IEEE

SFSAN Approach for Solving the Problem of Small Files in Hadoop

Abstract: Hadoop is a distributed computing framework written in Java and used to deal with big data; it is designed to handle large files. Handling small files leads to some problems in Hadoop ...
