Tweets
Apache Parquet Retweeted
PSA: If you use the page-level statistics in @ApacheParquet please chime in on JIRA: https://issues.apache.org/jira/browse/PARQUET-1365 …
Apache Parquet Retweeted
Last speaker on the #europython's scientific room before lunch is Peter Hoffmann talking about #Pandas and #Dask to work with large datasets in @ApacheParquet. pic.twitter.com/gwwrRrgkkb
Apache Parquet Retweeted
Have a look at the @ApacheFlink bucketing sink rework for the upcoming release and the Parquet writer ;)
Apache Parquet Retweeted
@StackOverflow @ApacheSpark Can someone answer this -> why is @ApacheParquet format faster than other columnar storage like hbase, kudu etc? https://stackoverflow.com/q/48761227/3185670?stw=2 …
Apache Parquet Retweeted
My talk from the DMBI 2018 Conference at @bengurionu about our journey at @Verint_Cyber to #BigData #Cyber Analytics on @hadoop @ApacheSpark @apachekafka @ApacheParquet is available at https://www.youtube.com/watch?v=nh-JyY6Wy4c … . Thanks everyone for attending!
Apache Parquet Retweeted
How big @CERN data is?? Well... after filtering the collisions, they generate 12.3 PB in a month... Special ROOT format + @ApacheParquet #JSON #avro #DWS18 pic.twitter.com/mAVpz47Kif
Apache Parquet Retweeted
In one month from now I'll be speaking on @Verint big data journey with @ApacheSpark @apachekafka and @ApacheParquet at the #StrataData Conference in London. If you're there, drop by! https://conferences.oreilly.com/strata/strata-eu/public/schedule/speaker/278192 …
Apache Parquet Retweeted
2nd #PyDataLDN #keynote - @holdenkarau & @BooProgrammer walk us through a zoo of #tools for #BigData & #distributed #data in #Python: #Apache #Spark, #PySpark, #Arrow, #Beam, #Parquet & #Dask @ApacheSpark @ApacheArrow @ApacheBeam @ApacheParquet @dask_dev #PyData @PyData @NumFOCUS pic.twitter.com/Tlunq778ha
Apache Parquet Retweeted
Join the #GPU accelerated #analytics and #ML revolution. @ApacheArrow @ApacheParquet and @gpuoai #GTC18 pic.twitter.com/bgzCJEt8Gm
Apache Parquet Retweeted
Great benchmark between @ApacheParquet on #hdfs and @ApacheKudu https://blog.clairvoyantsoft.com/guide-to-using-apache-kudu-and-performance-comparison-with-hdfs-453c4b26554f … In short kudu is faster than Parquet for random access queries like CRUD operations but slower for analytics queries.
Apache Parquet Retweeted
If you’re a company using open source projects and not sure how to contribute, a release engineer would be a tremendous help. It’s hard to do this properly part time. I have a specific project in mind, if you need a hint.
Apache Parquet Retweeted
You do not need Spark to create @ApacheParquet files, you can use plain Java and it can even fit in AWS Lambda for a serverless solution: https://engineering.opsgenie.com/analyzing-aws-vpc-flow-logs-using-apache-parquet-files-and-amazon-athena-27f8025371fa …
Apache Parquet Retweeted
Learn how to use hive views for advanced schema evolution #hive http://blog.nuvola-tech.com/2017/02/schema-evolution-with-hive-and-parquet-using-partitioned-views/ …
Apache Parquet Retweeted
I'll be speaking at #StrataData Conference this May in London, and share our journey in one of our many #BigData adventures with @ApacheSpark. You're all invited! https://conferences.oreilly.com/strata/strata-eu/public/schedule/speaker/278192 … @apachehadoop @apachekafka @ApacheParquet @strataconf
Apache Parquet Retweeted
Also the file size went down from 10Gigs to 3Gigs without any compression.
Apache Parquet Retweeted
Working with a 10Gig csv data. Pandas read_csv took 16mins to load the csv into memory. Converted to @ApacheParquet with @ApacheArrow. It took 30 secs to read into pyarrow table and 16 sec to convert to pandas dataframe. 16mins => 46sec! https://tech.blue-yonder.com/efficient-dataframe-storage-with-apache-parquet/ … pic.twitter.com/nECwiWlhgL
Apache Parquet Retweeted
Come hear me talk about @ApacheArrow and @ApacheParquet at #NABDConf in Palo Alto next Tuesday! https://twitter.com/jqcoffey/status/927859244912824321 …
Apache Parquet Retweeted
At @ucc_bdcat today in #Austin presenting our work with @pbr_wur on managing #agri #genomic #bigdata with @ApacheSpark and @ApacheParquet pic.twitter.com/ynOrgXac9n