Readit News
IvanVergiliev commented on Pg_hint_plan: Force PostgreSQL to execute query plans the way you want   github.com/ossc-db/pg_hin... · Posted by u/justinclift
IvanVergiliev · a year ago
Using the hint table has been pretty painful, in my experience. Two main difficulties I’ve seen:

1. The hint patterns are whitespace-sensitive. Accidentally put a tab instead of a couple of spaces, and you get a silent match failure.
2. There are different ways to encode query parameters: `?` for literals in the query text, `$n` for query parameters. psql and pgcli don’t support parameterized queries, so you can’t use them to iterate on your hints.

Still super useful when you have no other options though.
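For illustration, a hint-table entry looks roughly like this (a hedged sketch based on the pg_hint_plan docs; the `hint_plan.hints` table and its columns come from the extension, while the query and hint themselves are made up). Literals in the stored pattern are written as `?`, prepared-statement parameters stay as `$1`, `$2`, ..., and the pattern must match the normalized query text byte-for-byte, whitespace included:

```sql
-- Constant literals are normalized to '?'; actual query parameters
-- remain as '$1', '$2', ...
-- A single tab-vs-space difference here means the hint silently
-- never matches.
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES ('SELECT * FROM users WHERE id = ?;', '', 'IndexScan(users)');
```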

IvanVergiliev commented on O(1) Build File   matklad.github.io/2023/12... · Posted by u/sam_bristow
dharmab · 2 years ago
Similarly, it's a common pattern to have a Makefile for each directory in a C project.
IvanVergiliev · 2 years ago
Just an FYI in case you haven’t read this: Recursive Make Considered Harmful [1]

[1] https://aegis.sourceforge.net/auug97.pdf

IvanVergiliev commented on Anki-fy your life   abouttolearn.substack.com... · Posted by u/mililitre
Vaslo · 2 years ago
I use Anki with my young son on sight words. It works great, but I wish it were a little easier to do what I want instead of following the protocol exactly. I know that defeats the whole point of Anki, but sometimes I just want to test him on older material, or on cards he hasn’t seen for various reasons. I can never quite figure it out on the phone.
IvanVergiliev · 2 years ago
You can use a “Custom Study Session” [1] for this. I tried it recently and it was fairly decent.

One possible way to get there on mobile:

1. Start reviewing a deck.
2. Click the gear button.
3. Choose “Custom study” and select one of the options.
4. If studying tagged cards, go tag some cards via the Browse view.

[1] https://docs.ankiweb.net/filtered-decks.html#custom-study

IvanVergiliev commented on Accidentally exponential behavior in Spark   heap.io/blog/accidentally... · Posted by u/drob
mjburgess · 4 years ago
Is there anything you can say here about why you're running this query in Spark?

Supposing spark is your ETL machinery... would it not make more sense to ETL this into a database?

IvanVergiliev · 4 years ago
Definitely. One of the primary benefits we get out of Spark is the ability to decouple storage and compute, and to very easily scale out the compute.

Our main Spark workload is pretty spiky. We have low load during most of the day, and very high load at certain times - either system-wide, or because a large customer triggered an expensive operation. Using Spark as our distributed query engine allows us to quickly spin up new worker nodes and process the high load in a timely manner. We can then downsize the cluster again to keep our compute spend in check.

And just to provide some context on our data size, here's an article about how we use Citus at Heap - https://www.citusdata.com/customers/heap . We store close to a petabyte of data in our distributed Citus cluster. However, we've found Spark to be significantly better at queries with large result sets - our Connect product syncs a lot of data from our internal storage to customers' warehouses.

IvanVergiliev commented on Accidentally exponential behavior in Spark   heap.io/blog/accidentally... · Posted by u/drob
IvanVergiliev · 4 years ago
Post author here. Let me know if you have any questions!
IvanVergiliev commented on Accidentally exponential behavior in Spark   heap.io/blog/accidentally... · Posted by u/drob
lanna · 4 years ago
It looks like transform(tree.left) returns an Option[Tree] already (otherwise the code would not type check) so the entire if-else in the original code seems redundant and could be replaced with:

    val transformedLeft = transform(tree.left)

IvanVergiliev · 4 years ago
Just responded to the parent comment as well: there's an additional mutable argument to the real `transform` method, so it's unsafe to invoke it directly without first checking whether the tree is convertible.
IvanVergiliev commented on Accidentally exponential behavior in Spark   heap.io/blog/accidentally... · Posted by u/drob
mlyle · 4 years ago
Why not just...

  val transformedLeftTemp = transform(tree.left)

  val transformedLeft = if (transformedLeftTemp.isDefined) {
    transformedLeftTemp
  } else None

IvanVergiliev · 4 years ago
Good question, the simplified example doesn't make this clear.

The real implementation has a mutable `builder` argument used to gradually build up the converted filter. If we perform the `transform().isDefined` call directly on the "main" builder, but the subtree turns out not to be convertible, we can mess up the state of the builder.

The second example from the post would look roughly like this:

  val transformedLeft = if (transform(tree.left, new Builder()).isDefined) {
    transform(tree.left, mainBuilder)
  } else None

Since the two `transform` invocations are different, we can't cache the result this way.
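The resulting blowup is easy to demonstrate outside of Spark. Here's an illustrative Python sketch (the names and structure are mine, not Spark's): each node probes its child with a throwaway call before the real one, so a left-deep chain of depth n costs 2^(n+1) - 1 calls instead of n + 1:

```python
calls = 0

def transform(depth):
    """Stand-in for the Scala transform(): reports whether a chain of
    `depth` nodes is convertible, counting every invocation."""
    global calls
    calls += 1
    if depth == 0:
        return True
    # Probe the subtree with a throwaway "builder" first...
    if transform(depth - 1):
        # ...then transform it again for real, repeating all that work.
        return transform(depth - 1)
    return False

transform(10)
print(calls)  # 2047 calls (2^11 - 1) for a chain of depth 10
```

Caching `transform(depth - 1)` in a local variable would collapse this back to one call per node, which is essentially what the fix does once the builder mutation is untangled.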

There's a more detailed explanation in the old comment to the method: https://github.com/apache/spark/pull/24068/files#diff-5de773... .

IvanVergiliev commented on Old, Good Database Design   relinx.io/2020/09/14/old-... · Posted by u/jelnur
mcny · 5 years ago
I'd like to know this as well. I think you'll just have to build things (potentially horribly) and fail.

I took three semesters of database (granted, baby database classes) and I still have no idea how you can do something pretty straightforward like creating a room reservation system.

If there is a reservation beginning at 10:15 AM and ending at 12:30 PM and someone tries to book a reservation from 10:00 AM to 10:30 AM, the transaction should fail.

And before someone screams DB2: yes, DB2 can. But then you'd have to use DB2 https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/co...

Why is this so difficult...

IvanVergiliev · 5 years ago
An exclusion constraint on a GiST index?

https://stackoverflow.com/a/51247705
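For the room-reservation case specifically, the standard PostgreSQL recipe is an `EXCLUDE` constraint over a range type (a sketch only; the table and column names here are made up, and `btree_gist` is needed so the plain-equality column can participate in the GiST index):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE reservations (
  room_id  int      NOT NULL,
  during   tsrange  NOT NULL,
  -- Reject any row whose room matches (=) and whose time range
  -- overlaps (&&) an existing reservation.
  EXCLUDE USING gist (room_id WITH =, during WITH &&)
);
```

With this in place, booking 10:15–12:30 succeeds, and a subsequent 10:00–10:30 booking for the same room fails with a constraint violation, enforced atomically by the database.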

u/IvanVergiliev

Karma: 86 · Cake day: March 16, 2012