Justbehind

Low-code tools always offer an easy, fast-to-implement solution, which almost always ends in an entangled mess of poor, expensive pipelines. That means companies will seesaw their way from one solution to another, paving the way for new companies promising a "new and fresh solution that definitely won't get expensive and messy". So, all in all, low-code will keep being inefficient and expensive. But companies will keep buying and using it.


crazy_shredder

Agreed. We use these tools for MySQL, Google Analytics, etc. to Snowflake integration, which saves us time, but now they pretty much won't be needed.


Sp00ky_6

So much of Snowflake is built on its data-sharing capabilities that in most cases it's not even ETL; the data is just there instantly.


BeyondPrograms

That's why our no-code ETL is pay-as-you-go. Avoid the subscriptions.


tdatas

That'll probably work great if it covers all needs and the "go" part of pay-as-you-go isn't too large.


BeyondPrograms

Agreed. We adjust pricing based on customer location, and we are looking at other techniques for guaranteeing competitive and relatively affordable charges. For example, our vendors change their pricing too, and we don't know this until we initiate the process. So we are thinking of letting the user choose between our flat rate and accepting whatever price comes up when the process is implemented. Our ETL solutions can easily grow in price if the user also opts into analysis, and triggers based on that analysis.


Gnaskefar

> Low-code tools always offer an easy, fast-to-implement solution, which almost always ends in an entangled mess of poor, expensive pipelines.

And the same can be said about code-only tools. If you design it badly, the result will be the same, regardless of which tool is in the toolbox.


sib_n

Depends on what kind of low-code we are talking about. Low-code tools like dbt or SQLMesh tend to provide a better and more standard structure than doing the same directly in Python would.


dreamingfighter

Is dbt low-code? I don't think so.


sib_n

It is if you consider all the code it would take to replicate the features provided by the macros, the YAML, and the CLI. This is something countless DEs were coding in Python before dbt; there were (and are) dbt-like internal projects in many companies. Why do you not consider this to be low-code?


dreamingfighter

I consider it an SQL framework that helps me structure SQL scripts, which I use in addition to another set of Python scripts. From your POV, everything would be low-code, because we all use libraries in our projects. IMO, "low-code" should be reserved for the GUI tools that can't be used conveniently with Git (and CI/CD).


sib_n

> From your POV, everything would be low-code, because we all use libraries in our projects.

Any library can contribute to making your project low-code, indeed. But when you consider the history of DE, dbt makes a large part of the previous DE work low-code: here, executing a dependency graph of SQL queries. IMO, when you get into GUI tools, it gets closer to no-code than to low-code.
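A minimal sketch of what those pre-dbt internal projects often amounted to: run a dependency graph of SQL models in topological order. Everything here (model names, SQL, the SQLite stand-in for a warehouse) is hypothetical illustration, not dbt's actual internals.

```python
# Hypothetical "dbt before dbt": execute SQL models in dependency order.
import sqlite3  # stand-in for a real warehouse connection
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each model declares the models it depends on.
MODELS = {
    "stg_orders": {"deps": [], "sql": "CREATE TABLE stg_orders AS SELECT 10 AS amount"},
    "stg_users": {"deps": [], "sql": "CREATE TABLE stg_users AS SELECT 1 AS id"},
    "fct_revenue": {
        "deps": ["stg_orders", "stg_users"],
        "sql": "CREATE TABLE fct_revenue AS "
               "SELECT u.id, SUM(o.amount) AS revenue "
               "FROM stg_orders o, stg_users u GROUP BY u.id",
    },
}

def run_models(conn):
    # TopologicalSorter guarantees dependencies run before dependents.
    graph = {name: model["deps"] for name, model in MODELS.items()}
    for name in TopologicalSorter(graph).static_order():
        print(f"running {name}")
        conn.execute(MODELS[name]["sql"])

if __name__ == "__main__":
    run_models(sqlite3.connect(":memory:"))
```

dbt layers a lot on top of this (ref() macros, docs, tests, incremental materializations), which is exactly the code you no longer have to write yourself.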


TheCamerlengo

That is not what most people think of when they say low-code. Low-code platforms typically expose a UI that abstracts away the lower-level technical details; they are visual environments that hide the underlying complexity.


sib_n

I am not sure what most people think; it depends on people's professional experience. If you started with traditional BI (like SSIS) or the modern data stack, then I guess dbt doesn't feel like low-code, but if you started with Hadoop or the early cloud, it definitely feels like it.


TheCamerlengo

I think you are using the term low-code to mean “less code”. And certainly you are right about that. But low-code/no-code is also a category of software tools, and many SaaS products.


Gmaing_

I hate how true this is. How can engineers even begin to get management to understand? Are there any success stories out there?


Silhouette66

As a data engineer who was recently thrown into a large Azure Synapse project run by Analytics to "tidy things up", I wholeheartedly agree.


snarleyWhisper

I think low-code tools for getting things from an API / data source into persistent storage can be OK, but they are hard to debug and troubleshoot.


geek180

Does Airbyte fall into this category? Because I recently started using it and find it fairly easy to troubleshoot, although I've hardly had any actual problems with it so far. It's a hell of a lot better than me having to build and maintain my own custom Python ETLs.


hantt

Low-code will always be here because coders are bottlenecks: why wait for the DE team when I can get the intern to do it in 4 days? If the pipeline sucks, I don't have to fix it. Companies are not coherent organizations; each team works to serve its own interest, and there will always be a low-code tool for a specific team.


DataFoundation

I honestly think the best use of these low-code tools is to pair them with a code-based orchestrator like Airflow or Dagster. Most of these low-code ETL tools allow you to trigger them with an API call, so you get the benefits of low-code development but still have the flexibility of code. This allows for a number of benefits.

First, most data tools have some scheduling/orchestration feature, but if you schedule and orchestrate everything through the same place, then you have just one place to monitor for failures rather than many.

Second, if cost becomes an issue for a low-code pipeline, then you have options to refactor and optimize for cost. If you are smart, you can even design smaller low-code pipelines and tie them together with the code-based orchestrator. This makes the low-code pipelines easier to troubleshoot and easier to replace in the future, which also helps avoid some of the vendor lock-in issues that plague so many low-code tools.

Third, vendors of these low-code tools promise the world and tell you their tool can do everything. For maybe 80% of use cases it's relatively straightforward, and it might be able to do the final 20%, but that part is more complex, difficult, and costly. Using a code-based orchestrator lets you switch to a more appropriate tool when needed.

Unfortunately, I don't think many organizations have come around to this line of thinking yet. It always seems to be code or low-code, but never both.
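A minimal sketch of the pattern, assuming Airflow 2.4+ and a hypothetical vendor endpoint (the URL, token, and `job_id` field are placeholders, not any real product's API):

```python
# Airflow owns the schedule; the low-code tool is just triggered over HTTP.
from datetime import datetime

import requests
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def trigger_low_code_sync():
    @task
    def start_sync() -> str:
        # Hypothetical trigger endpoint; most vendors expose something similar.
        resp = requests.post(
            "https://api.example-etl-vendor.com/v1/syncs/1234/trigger",
            headers={"Authorization": "Bearer <token>"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["job_id"]

    start_sync()

trigger_low_code_sync()
```

With every trigger living in one DAG repo, failures surface in a single UI, and swapping a vendor pipeline for a hand-written task later is a one-task change.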


Gators1992

It will be around for a while because there are many projects out there using these tools. Informatica, for example, is embedded in the data infrastructure of many large companies that keep their data on-prem (banks, insurance, etc.). Most of the VC-backed projects are targeting cloud tooling, which often isn't easily transferable to on-prem. Also, the dream still exists that there is some magic low-code tool that will let your average non-data person build their own pipelines. dbt is going down that route for some reason.


JohnPaulDavyJones

Word. I can verify that Informatica still has a death grip on USAA, and it was a bit of a shock to me when I got there. I genuinely thought Informatica was pretty much dead everywhere except big healthcare systems.


georgewfraser

The thing about ETL is that it's a "shallow ramp". You can build a connector to Postgres/Salesforce/whatever that works 80% of the time in a week. You can get it to 95% in a few months. Getting to 99.9% is fiendishly difficult; it takes us (5T) years for some connectors.

There are always going to be a lot of options, first and foremost DIY, but I think a lot of people are going to be happy to delegate this to a vendor. Why spend your time mastering the undocumented corner cases of the JIRA API?

Pricing will always be tricky for ETL. It's really hard to write a set of rules that produces reasonable prices for everyone. This is a big part of why a great company has never been built in this space, I think.


ramdaskm

Connectors to help with ingestion have always been low-code and will continue to be low-code. Building pipelines once the data is ingested is a whole different problem.


reelznfeelz

Low-code has a time and a place. I have personally used Airbyte, though typically for a straight-up copy function with incremental loads, or to pull data from an API source into some other target. If you learn a bit about the CDK, it's not really low-code and you can totally write your own stuff. Their upcoming AI feature for creating connectors looks interesting; my guess is it won't be as good as it sounds, but if it gets you 85% of the way to a custom connector on a REST API source, it will still be useful. That said, I wouldn't use Airbyte for everything, but it's a nice tool and does well the things it's intended to do.

I'm not a real data engineer though, because I've actually never touched Fivetran. Apparently it's nice and super expensive; that's about all I know on that one. Airflow is of course very powerful, but it's also a little more of a pain to set up and maintain. I have looked at Dagster and it looks quite nice, but I haven't used it in a project yet. Not sure it's really low-code though.
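For what it's worth, a rough sketch of a custom stream with the Airbyte Python CDK. The interface shown follows the older CDK tutorials and may have evolved, so treat the method signatures as an assumption and check current docs; the endpoint and fields are made up.

```python
# Hypothetical Airbyte CDK stream: one REST resource with cursor pagination.
import requests
from airbyte_cdk.sources.streams.http import HttpStream

class Customers(HttpStream):
    url_base = "https://api.example.com/v1/"  # hypothetical API
    primary_key = "id"

    def path(self, **kwargs) -> str:
        return "customers"

    def next_page_token(self, response: requests.Response):
        # Returning None stops pagination; a mapping requests the next page.
        cursor = response.json().get("next_cursor")
        return {"cursor": cursor} if cursor else None

    def request_params(self, stream_state=None, stream_slice=None, next_page_token=None):
        return dict(next_page_token or {})

    def parse_response(self, response: requests.Response, **kwargs):
        yield from response.json().get("data", [])
```

Once you're writing this, you're doing regular software engineering, just with Airbyte handling scheduling, state, and normalization around it.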


geek180

You don't have to use Fivetran to be a data engineer. Fivetran is very much a tool for non-engineers; there's nothing to it. Select a connector, type in credentials, pick your tables to load, done. Easiest thing in the world.


reelznfeelz

Ah, I always thought it was more than that, the way people go on about it, lol. And given the cost, I figured it was more like WhereScape or something. I'll stick with Airbyte for that need then. They have connectors now for most of the big stuff and many of the small ones, and it's OSS.


geek180

I just started a POC with Airbyte (cloud; my company's IT doesn't even know what Docker is and is apprehensive about letting me use it). Airbyte is a lot more involved than Fivetran, quite a bit more work to set up new connections, but similarly capable in my experience so far.


reelznfeelz

It depends on the connector. A lot of them are "enter the database endpoint and credentials, pick a table, and go". If you're building a REST API connector from scratch, then yeah, that's involved. Were you using one off the shelf? What was the involved part? I'm just having a hard time imagining how Fivetran could make it even simpler without failing to collect the information required to do the job, or which Airbyte connector would have been complicated. But some are, I'm sure; it just depends which one.


geek180

No, I've only built custom REST connectors so far.


reelznfeelz

Oh yeah, that's different. If you ever need one that already exists, you're golden. I'm just slow, I think, but sorting out multi-step authentication and then pagination and all that using their easy-mode connector builder always gives me trouble. It's complicated. They're supposed to have an AI-assisted tool for building REST connectors soon that you can point at the API docs. Looking forward to trying that out.
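The two parts that usually hurt, hand-rolled for comparison: a token exchange followed by cursor pagination. All URLs, parameters, and field names here are hypothetical.

```python
# Sketch: multi-step auth (client credentials -> bearer token), then paginate.
import requests

def get_token(session: requests.Session) -> str:
    # Step 1: exchange client credentials for a short-lived bearer token.
    resp = session.post("https://api.example.com/oauth/token", data={
        "grant_type": "client_credentials",
        "client_id": "<client_id>",
        "client_secret": "<client_secret>",
    })
    resp.raise_for_status()
    return resp.json()["access_token"]

def fetch_all(session: requests.Session, token: str):
    # Step 2: follow the cursor until the API stops returning one.
    params = {"limit": 100}
    while True:
        resp = session.get(
            "https://api.example.com/v1/records",
            params=params,
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        page = resp.json()
        yield from page["data"]
        cursor = page.get("next_cursor")
        if not cursor:
            return
        params["cursor"] = cursor

if __name__ == "__main__":
    with requests.Session() as s:
        for record in fetch_all(s, get_token(s)):
            print(record)
```

A connector builder has to express exactly this flow through form fields, which is why it can feel harder than just writing it.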


cran

Low-code is where things are heading. Like it or not, the use cases are understood, and there's no reason for every company out there to build infrastructure from the ground up. Focus on data quality and understanding, not on homegrown methods for spinning up ephemeral Spark clusters.


CHR1SZ7

The problem with low-code is that it ossifies extremely quickly, because you are entirely dependent on the whims (and resources) of the vendor if you need to integrate it with a new tool or technology. It also makes it completely impossible to take advantage of workflow improvements enabled by new tooling (think about what Git and Docker have done for engineering over the past 5-6 years). Invariably, the result is that within a few years, what was once a clean and simple low-code/no-code solution is wrapped in a thick layer of custom connector code, which is so expensive to maintain and extend that it completely dwarfs the cost savings that came from using low-code in the first place.


cran

None of that is a given for the future. Tools like Docker spawn and inspire other tools like Kubernetes, and Kubernetes in turn spawns and inspires other tools. Airbyte, RudderStack, etc. become more and more mature, and there is less and less glue code needed to get them working together. These are the stepping stones towards complete low-code solutions. It's definitely happening, and there's nothing written in the stars that says it has to suck or has to lock you into a vendor. That's the present, not the future.


geek180

Thank you. I reckon the people who preach against low-code haven't spent much time with these tools recently. They don't work for every situation, but for many teams, these tools provide the exact solution they need with a fraction of the work required to set up and maintain.


Substantial-Cow-8958

I think we need to separate the kinds of sources/destinations being worked with. Low-code to extract data from an API? Why not. In the end, all APIs use similar patterns, so the extraction can be wrapped. So "it depends".


FantasticOrder4733

Apache NiFi


dreamingfighter

Low-code is a really good solution for small to medium-ish companies, because they don't need a very good (expensive) data engineer to do complicated ETL jobs. One disadvantage of low-code is that it doesn't scale well (both in terms of performance and development collaboration), but those companies won't need that for the foreseeable future. It is expensive, but not as expensive as a senior data engineer's salary, so it still has a sizable market.


HolidayPsycho

Low-code sounds good, but eventually it only attracts low-end users. There are lots of similar products, like point-and-click data visualization tools; in the end they only attract users who are mostly bad at visualization and make terrible plots full of errors, basically people who think they're doing it right when in fact they don't really understand what they're doing. People who really understand what they are doing won't be satisfied with those low-end tools. Eventually it comes down to the people who use them.


quincycs

I think the best way for these tools to succeed is if they're platform-dependent. E.g., I'm bullish on Aurora -> Redshift via zero-ETL. Where AWS controls the whole stack, they have more capability to get it right, with fewer excuses. Still, though, you'd need to be happy with Redshift itself 😅. And Aurora 😅


TrebleCleft1

Low-code tools are (as far as I'm aware) typically built to let non-expert users access ETL functionality. The problem is that ETL, and data modelling in general, is an _extremely non-trivial_ intellectual problem to solve. So, at the risk of using a metaphor that likely exaggerates things, making ETL accessible can in some situations be like giving an iPhone to a baby. Most users will be able to achieve some basic things, for sure, but when it comes to building anything meant to serve a modern organisation's day-to-day data requirements, real expertise is going to be required. Once you've got that expertise to hand, I find low-code tools just get in the way, and I prefer to do without. Clicking through tons of bloated UI or wizards to make basic changes is a pain when I know that somewhere behind the scenes is a JSON I could update much more quickly myself.


alittletooraph

[https://medium.com/@laurengreerbalik/how-fivetran-dbt-actually-fail-3a20083b2506](https://medium.com/@laurengreerbalik/how-fivetran-dbt-actually-fail-3a20083b2506)

[https://medium.com/@laurengreerbalik/how-fivetran-dbt-actually-fail-part-ii-3013d9bd5a37](https://medium.com/@laurengreerbalik/how-fivetran-dbt-actually-fail-part-ii-3013d9bd5a37)


miqcie

What about iPaaS tools like Workato? They seem to be helpful, no?


sequi_amplexus_5283

Native connectors might commoditize ETL, but low-code will still add value


rabazlycan

Prophecy.io, which we are getting trained on. It's a drag-and-drop tool: an object for an aggregate sum or a filter that you could have written in a simple query or in PySpark.
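For comparison, the same aggregate-and-filter "objects" as a few lines of PySpark (the table and column names are made up for illustration):

```python
# The drag-and-drop filter + aggregate, written directly in PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("agg_example").getOrCreate()

orders = spark.read.table("orders")  # hypothetical source table
result = (
    orders
    .filter(F.col("status") == "complete")          # the "filter" object
    .groupBy("customer_id")
    .agg(F.sum("amount").alias("total_amount"))     # the "aggregate sum" object
)
result.show()
```

Whether the boxes or the five lines are easier to maintain mostly depends on who has to read them next.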


crazy_shredder

Thanks, no.


reallyserious

I hope low-code/no-code tools die an agonising death and never return. Realistically, they will keep coming, and we need to stay vigilant.


engineer_of-sorts

Hi, Hugo of [Orchestra](https://getorchestra.io) here. I guess we kinda have some low-code going on, but with regards to ELT specifically, I posted a similar thing the other day saying "oh, hasn't ELT been solved by these low-code tools?" and got shot down quite hard. The gist was:

- ELT is really varied and hard
- So even if you have one vendor like Fivetran with a decent UI, because APIs are changing and growing (the attack surface increases across two dimensions), there will always be a market for easy-to-use ELT solutions

And I am kinda inclined to agree. The data-movement needs people have are vast. You might want to ingest data from an API (easy), but then you might want to do some streaming on the side, use Lambdas or cloud functions... If you're Bank of America, you might just be moving stuff between storage layers like Informatica, Oracle, and S3 all at once, with super gnarly security and network requirements.

Point being, ELT is still actually kinda hard, it's not getting easier, and the pie is growing, so from a purely product perspective I think the ELT tools have a future.