Author | Message
Siloan (Legend) | Post #47992 | Registered 03-2008 | Posted From 98.31.1.121 | Posted on Saturday, September 24, 2016 - 01:39 pm:
Gandhiguevara:
Send me a mail.
Myselfme (Junior Artist) | Post #320 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 01:38 pm:
Gandhiguevara: very expensive... I'm looking at open source options
Try Kafka; it's a proven solution. The latest version comes with Kafka Connect, which sinks data to HDFS.
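A minimal sketch of what that Kafka Connect HDFS sink might look like; the connector class and config keys follow Confluent's HDFS sink connector, but the connector name, topic names, NameNode URL, and batch sizes here are invented placeholders:

```json
{
  "name": "adserver-hdfs-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "tasks.max": "4",
    "topics": "clicks,impressions",
    "hdfs.url": "hdfs://namenode:8020",
    "flush.size": "10000",
    "rotate.interval.ms": "60000"
  }
}
```

This JSON would typically be submitted to the Connect worker's REST API to start the connector; `flush.size` and `rotate.interval.ms` control how often files are committed to HDFS.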
Myselfme (Junior Artist) | Post #319 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 01:33 pm:
Myselfme: denormalized data
I meant it is all denormalized data.
Myselfme (Junior Artist) | Post #318 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 01:32 pm:
Gandhiguevara:
Camus is nothing but a map-reduce job that pulls data from Kafka and writes it to HDFS. Gobblin is more useful if you have data from multiple sources, and it claims a better map-distribution algorithm.
Myselfme (Junior Artist) | Post #317 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 01:30 pm:
Gandhiguevara:
No relational (big) data; it's denormalized data. But to make sense of this data there is business logic in Postgres tables. Those are just a few hundred rows, so I use a map-side join to extract the business intelligence.
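The map-side join described above can be sketched as follows: because the Postgres business-logic table is only a few hundred rows, each mapper loads it fully into memory and joins locally, so the big denormalized records never need to be shuffled. The table contents and field names below are invented for illustration.

```python
# Small dimension table: in practice, a few hundred rows pulled
# from Postgres and distributed to every mapper (e.g. via the
# distributed cache); hardcoded here for illustration.
campaign_lookup = {
    101: {"advertiser": "acme", "region": "us-east"},
    102: {"advertiser": "globex", "region": "eu-west"},
}

def map_record(record):
    """Join one big-side log record against the in-memory lookup."""
    dim = campaign_lookup.get(record["campaign_id"])
    if dim is None:
        return None  # no matching business-logic row; drop the record
    return {**record, **dim}  # merge log fields with dimension fields

# Big-side records (normally read from HDFS splits).
clicks = [
    {"campaign_id": 101, "clicks": 3},
    {"campaign_id": 999, "clicks": 1},  # unknown campaign, filtered out
]
joined = [r for r in map(map_record, clicks) if r is not None]
```

The same broadcast-join pattern is what Hive's map joins and Spark's broadcast joins automate when one side of the join is small enough to fit in memory.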
Gandhiguevara (Legend) | Post #61719 | Registered 10-2009 | Posted From 67.191.221.7 | Posted on Saturday, September 24, 2016 - 01:19 pm:
Myselfme:
Camus is being phased out... it looks like Gobblin is replacing it, but there are no references of anyone having implemented it in prod yet.
Gandhiguevara (Legend) | Post #61718 | Registered 10-2009 | Posted From 67.191.221.7 | Posted on Saturday, September 24, 2016 - 01:16 pm:
Siloan:
Very expensive... I'm looking at open source options.
Gandhiguevara (Legend) | Post #61716 | Registered 10-2009 | Posted From 67.191.221.7 | Posted on Saturday, September 24, 2016 - 01:15 pm:
Myselfme:
So you don't have relational data as a source in your environment?
Myselfme (Junior Artist) | Post #316 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 11:58 am:
Emc2, I'm trying to understand the same thing. As of now, the HDFS data is written back into Greenplum, and the BI and data science folks run Greenplum queries to pull what they need. Why all of this? Couldn't they use Impala to run the queries directly on the Hadoop nodes? Why hundreds of Greenplum nodes? I think they use Greenplum in the traditional way, as a data warehouse. One factor is data security: Greenplum permissions are straightforward to maintain, while Hadoop is a little complicated on that front, though it seems to have gotten easier now (LDAP + Kerberos, and Cloudera supporting encryption). Cloudera and Hortonworks have released a few packages for authentication and authorization.
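For reference, querying the Hadoop nodes directly on a kerberized cluster is just a flag away in `impala-shell`; the principal, host, and query below are hypothetical placeholders:

```shell
# Obtain a Kerberos ticket, then connect to an impalad with Kerberos
# auth (-k) and run the BI query in place, no Greenplum round-trip.
kinit analyst@EXAMPLE.COM
impala-shell -k -i impalad-host:21000 \
    -q "SELECT campaign_id, COUNT(*) FROM clicks GROUP BY campaign_id"
```

Authorization on top of this (table- and column-level grants comparable to Greenplum permissions) is what packages like Apache Sentry, shipped by Cloudera at the time, were meant to provide.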
Emc2 (Legend) | Post #50765 | Registered 03-2008 | Posted From 173.66.111.159 | Posted on Saturday, September 24, 2016 - 11:47 am:
Gandhiguevara:
What else are you doing in Greenplum other than using it as a DB? What other features are you using?
Myselfme (Junior Artist) | Post #315 | Registered 04-2011 | Posted From 73.22.114.102 | Posted on Saturday, September 24, 2016 - 11:44 am:
Data ingestion mostly means writing data to HDFS. We have ad servers on the east coast, the west coast, and in Europe; the ad servers write their logs (clicks, impressions) to Flume. Those are Flume compression agents, and they push the data onward; everything arrives at the Chicago data center, where Flume landing agents pick it up and sink the data into Kafka. To write from Kafka to HDFS we use LinkedIn's Camus. Since this is business logic I had tried to write it in Telugu, although it's not a secret.
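The landing-agent leg of that pipeline could be sketched as a standard Flume properties file: an Avro source receiving from the remote compression agents, a durable file channel, and the built-in Kafka sink. The sink type and property names are Flume's own; the agent name, host names, port, and topic below are invented.

```properties
# One Flume "landing agent": remote agents -> Avro source -> file channel -> Kafka
agent.sources  = avro-in
agent.channels = ch1
agent.sinks    = kafka-out

# Avro source: receives events pushed by the east/west/EU compression agents
agent.sources.avro-in.type     = avro
agent.sources.avro-in.bind     = 0.0.0.0
agent.sources.avro-in.port     = 4141
agent.sources.avro-in.channels = ch1

# Durable channel so events survive an agent restart
agent.channels.ch1.type = file

# Kafka sink: lands the log events in a Kafka topic for Camus to pick up
agent.sinks.kafka-out.type                    = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafka-out.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
agent.sinks.kafka-out.kafka.topic             = adserver-logs
agent.sinks.kafka-out.channel                 = ch1
```

From there, a periodic Camus (map-reduce) run drains the topic into date-partitioned HDFS directories.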
Siloan (Legend) | Post #47991 | Registered 03-2008 | Posted From 98.31.1.121 | Posted on Saturday, September 24, 2016 - 11:34 am:
Guvvi... IBM has a tool called BigIntegrate, similar to BDE.
Gandhiguevara (Legend) | Post #61715 | Registered 10-2009 | Posted From 67.191.221.7 | Posted on Saturday, September 24, 2016 - 11:34 am:
Did you do any CDC / replication of relational data?
Gandhiguevara (Legend) | Post #61714 | Registered 10-2009 | Posted From 67.191.221.7 | Posted on Saturday, September 24, 2016 - 11:32 am:
"9 months mostly data ingestion into HDFS and Greenplum " Ingestion process yenti? emanna tools like attunity and streamsets vaadaaraa or Informatica BDE? |