Show HN: Verify JSON using minimal schema

tylerchr · 6 years ago

Very neat. I had a similar need recently which led to a similar solution [1]. Notably, I used a syntax nearly identical to yours, albeit without the very handy pluggable validators yours has. Nice job.

What led you to choose “!” as the “optional” modifier? My intuition would have guessed that character to have the opposite meaning.

[1]: project: https://github.com/tylerchr/jstn/ — playground: https://tylerchr.github.io/jstn/

willvarfar · 6 years ago

A long time ago, before MyPy, I did a python schema checker for JSON. It grew into argument checking too: https://pypi.org/project/obiwan/

Simple stuff like:

   Person = { "name": str, "age": int }

The basic idea was to use a file full of obiwan types to document the REST API. The schema was both human-readable and valid Python. The API could then trivially check the schema. Once the schema is checked, of course, it's safe to go walk the JSON knowing that things aren't going to crash on you etc.
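To illustrate the "check once, then walk the JSON safely" idea, here is a minimal sketch. It is not obiwan's actual API: the `check` function is a hypothetical helper, and it only understands types, nested dicts, and one-element lists.

```python
import json

# A schema in the same spirit as above: a plain dict mapping keys to types.
Person = {"name": str, "age": int}


def check(value, schema):
    """Return True if `value` matches `schema`.

    `schema` is a type, a dict of sub-schemas, or a one-element list [T]
    meaning "a list whose items all match T". Hypothetical helper, not obiwan.
    """
    if isinstance(schema, dict):
        return isinstance(value, dict) and all(
            key in value and check(value[key], sub) for key, sub in schema.items()
        )
    if isinstance(schema, list):
        return isinstance(value, list) and all(check(item, schema[0]) for item in value)
    if schema is int:
        # bool is a subclass of int in Python, so reject it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, schema)


doc = json.loads('{"name": "Ada", "age": 36}')
if check(doc, Person):
    # Safe to index: both keys exist and have the expected types.
    print(doc["name"].upper(), doc["age"] + 1)
```

Extra keys are allowed in this sketch; rejecting them would be a separate design choice.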

It worked great. Still works great. I still prefer the syntax to that of MyPy.

No real adoption, of course :) But fun to remember what could have been :)

wryun · 6 years ago

My vaguely similar thing - https://wryun.github.io/yajsonschema/

smoyer · 6 years ago

Maybe I'm in the minority but I don't want light-weight at the expense of compatible/standard. It's being recognized that validation is more important than we'd expected - the recent ACM guidelines for moderation on Postel's Law really resonated with my experiences with RESTful/JSON micro-services (https://queue.acm.org/detail.cfm?id=1999945). But if we're going to the effort of validating, let's do it in a language/framework agnostic way ... jsonschema.org is mentioned in other comments but I'm a fan of the OpenAPI initiative (https://www.openapis.org/). Then we can all share the validation rules!

yusufnb · 6 years ago

I understand. In my experience, the usability of those standards is pretty bad. Which in effect leads to projects not using anything at all.

One of the benefits of JSON over XML is that it is concise and fast to work with. The standard should reflect that part as well.

With this lib, verifying the JSON for a simple REST request is as simple as `verify(json, "{a,b,c}")`
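To give a feel for what a call like `verify(json, "{a,b,c}")` has to do, here is a minimal Python sketch. It is not the library's implementation or API: `verify_flat` is a hypothetical name, and it only understands a flat list of required keys (no nesting, types, or optional markers).

```python
import json


def verify_flat(document: str, schema: str) -> list[str]:
    """Return the keys named in `schema` that are missing from `document`.

    Hypothetical helper: `schema` must be a flat key list such as "{a,b,c}".
    """
    inner = schema.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("schema must look like {a,b,c}")
    keys = [key.strip() for key in inner[1:-1].split(",") if key.strip()]

    data = json.loads(document)
    if not isinstance(data, dict):
        raise ValueError("document must be a JSON object")

    # An empty result means every listed key is present.
    return [key for key in keys if key not in data]


print(verify_flat('{"a": 1, "b": 2}', "{a,b,c}"))  # prints ['c']
```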

smoyer · 6 years ago

> leads to projects not using anything at all

We're counting that against products when we're analyzing what services to adopt/purchase. You doing nothing at all ... or something non-standard results in everyone else having to attempt to do it for you with the resulting inconsistencies. I'm not sure why you think OpenAPI is tailored for XML ... Here's the specification for the sample application (petstore) with service and model definitions in YAML - https://github.com/OAI/OpenAPI-Specification/blob/master/exa....

catlifeonmars · 6 years ago

This has already been said in the nested comments, but why not “?” for optional? Coming from Typescript, Swift, etc “!” feels more like an assertion of non-nullity (is that a word?). It’s also easy to read as “not <key>”.

yusufnb · 6 years ago

Hmmm, true. ? might be better. Let me think it through. I will update the package to support that.

yusufnb · 6 years ago

you can track this issue for that change - https://github.com/yusufnb/verify-json/issues/2

oefrha · 6 years ago

Exclamation mark for optional is just weird. Question mark would be a lot more intuitive, unless there’s an origin story here I’m not aware of.

angry_cactus · 6 years ago

Not sure of the exact motivation, but Swift uses exclamation mark to unwrap optionals.

oefrha · 6 years ago

In Swift optional types are marked by a question mark (exactly as I suggested). Unwrapping an optional makes it non-optional, so the exclamation point has the opposite meaning.

yusufnb · 6 years ago

It's updated now to work with the ? syntax.

Eikon · 6 years ago

It's crazy the amount of work that is spent to essentially implement static typing into dynamically / duck typed languages.

Projects like this, mypy and others make the whole thing bizarre to say the least.

lewiscollard · 6 years ago

This has little to do with dynamically typed languages, "JS" in the name notwithstanding. JSON is a serialisation format that is used in both statically & dynamically-typed languages. It makes as much sense to call XML validators bizarre as it does to pin this project as a weird byproduct of dynamically typed languages.

If your program produces or consumes JSON _and_ you think it would be a good idea to ensure said JSON is what you are/a client is expecting, then you need something like this in any implementation language.

baq · 6 years ago

preach.

what I completely don't understand is the current fad on everything-should-be-yaml where yaml is marginally easier to write than XML, a tiny bit more than marginally easier to read (caveat emptor - Norway has a country code) and otherwise exactly as broken. (i'm talking about the 'safe' variant.)

BiteCode_dev · 6 years ago

I used to dislike type hints for Python because:

- I was afraid it would change the language culture

- It was not comfortable to use

- The idea and design were, as you said, bizarre

- I use a dynamic language, why the hell would I want that?

Years after the fact, I'm really happy about it.

First, most of my Python code doesn't use them, so Python stays Python. But if I ever need them, they are here.

Of course, you could argue I should just use a language that has been designed for that since I need types. But that's missing the point: I don't usually start using hints. I code regular Python, because it's awesome. I just sometimes add type hints after the fact as a bonus.

The ability to do that is fantastic, even if the type system is far from perfect.

I want to code with Python most of the time anyway. I personally have very few use cases for another language, doing mostly web stuff.

Type hints are still very verbose and convoluted, although it will get better with 3.9 thanks to two PEPs targeting type hint ergonomics. So sure, it's never going to fit perfectly, but I'm glad it's here.

Now for a JSON schema, I guess it's the same deal. You use JSON because it fits your use case, and one day you realize your situation could now benefit from more robustness, and you add it to the mix.

Again, you could argue you should have thought of that at the beginning, but projects don't have frozen requirements, they evolve. And often, you want to start lean, because it will most probably stay that way, and otherwise would be so costly the project would never grow to a point where you would need robustness.

Now in this particular case, this lib is not just checking type, but also arbitrary logic. And making sure data is consistent is necessary no matter how dynamic or static your language is. There is a limit to the constraints your type system can represent.

dehrmann · 6 years ago

I've worked with Python professionally for 3-5 years, but only type-hinted Python for ~5 months. This is coming from someone who prefers Java, but I don't like Python type hints because they were bolted onto the language worse than classes, type hierarchies aren't mature, and if I wanted types, I'd be using Java. That said, you do need strong typing for a project once it hits a certain line count or developer count.

dehrmann · 6 years ago

I led a project for validating customer-provided data against a JSON schema. The structure of the data was moderately complex, and it was getting bound to POJOs by a pretty lenient binder. This caused issues when making additions to the API since customers were sending arbitrary fields, and duck typing the data wasn't that reliable. The easiest thing was to validate with a JSON schema before binding.

What's different between this and Python type hints is that this was a public-facing API, so you have far less control over what data you'll see, and the server code was Java, so getting things strongly typed and validated early on makes your life easier.

zmmmmm · 6 years ago

There are languages though, e.g. Groovy, where gradual typing is more designed in. It's unfortunate to me that Python has cultivated this attitude because it's like a local maximum that prevents the mainstream getting to something better. Python is the new Perl / PHP etc.

thrwaway69 · 6 years ago

Have you ever checked dart?

Type inference can get you so so far

mhd · 6 years ago

It's not like bringing features you get for free from dynamic languages into static languages doesn't involve a crazy amount of work, either.

ken · 6 years ago

What statically typed language lets me define a type as “number between -180 and +180”, or “string which contains only alphanumeric chars”?

I think that would be a great feature but from what I see static typing fails here, too.

rraval · 6 years ago

> What statically typed language lets me define a type as “number between -180 and +180”, or “string which contains only alphanumeric chars”?

Pretty much all of them. Any simple predicate like this can be encoded with witness types.

Here's an example in Java, which is hardly the paragon of static typing (i.e. it's no Haskell/Idris/Agda/Rust/Typescript):

    class AlphaNumericString {
        private final String str;

        // use a fallible factory with a `private` constructor if you're
        // morally opposed to exceptions
        public AlphaNumericString(String str) throws AlphaNumericException {
            // matches() already anchors to the whole string; ^ and $ are kept for clarity
            if (!str.matches("^[a-zA-Z0-9]*$")) {
                throw new AlphaNumericException();
            }

            this.str = str;
        }

        // public so that callers of the constructor can name it in a catch clause
        public static class AlphaNumericException extends Exception {
        }
    }

Now code can freely use `AlphaNumericString` and be guaranteed that it has been validated.

You may object and say that newtype wrapping is cumbersome but:

1. That's an argument about sugar and ergonomics, not about the semantics that the static type system enforces

2. Some languages make it easier to generate forwarding methods to the underlying type (a la https://kotlinlang.org/docs/reference/delegation.html)

3. The `AlphaNumericString` is describing a smaller set of values than `String`. In general, you should be strongly considering the methods you allow and make sure that all paths continue to enforce the semantics you intend.
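The other predicate from the question, "number between -180 and +180", fits the same wrapper pattern. Below is a minimal sketch in Python rather than Java; `Longitude` is a hypothetical name. Python is not statically typed, so the range check runs only at construction time, but a checker such as mypy can then treat `Longitude` as a type distinct from `float`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Longitude:
    """A number known to lie in [-180, 180]; validated once, at construction."""

    degrees: float

    def __post_init__(self) -> None:
        # Reject non-numbers (and bool, which is a subclass of int).
        if isinstance(self.degrees, bool) or not isinstance(self.degrees, (int, float)):
            raise TypeError("degrees must be a number")
        # NaN fails this comparison too, so it is rejected as out of range.
        if not -180 <= self.degrees <= 180:
            raise ValueError("degrees must be between -180 and +180")


def describe(lon: Longitude) -> str:
    # No re-validation needed: holding a Longitude is the proof.
    return f"{lon.degrees:+.1f}"


print(describe(Longitude(12.5)))  # prints +12.5
```

Because the dataclass is frozen, the value cannot be reassigned to something out of range after the check.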

wry_discontent · 6 years ago

It's not statically typed, but you can do this with clojure.spec

  (s/def ::number #(<= -180 % 180))
  (s/def ::my-string (s/and string? #(re-matches #"[a-zA-Z0-9]*" %)))

metalrain · 6 years ago

I think Idris does, see http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html... But probably there is some horrible trade-off or limits that seem arbitrary.

arethuza · 6 years ago

I think Pascal's subranges would cover both those examples?

NB Last time I used Pascal was probably 35 years ago....

Edit: I would suspect Ada would have something like this.

JoeAltmaier · 6 years ago

Been missing this for 20 years.

catlifeonmars · 6 years ago

This is orthogonal to type safety. In fact, schema-based validation is an excellent way to perform type narrowing even in statically typed languages. An extremely prolific example of this is parsing incoming JSON from network requests :)

globular-toast · 6 years ago

I don't think you understand static typing and you might be confusing dynamic typing with duck typing or other concepts. The presence of types doesn't mean static typing. Python, for example, is strongly typed, but it's dynamic because the type of any object can't be determined until run time. But you can still check types!

junke · 6 years ago

Declaring and checking the type of values is still dynamic typing, I see no static types here.

gunn · 6 years ago

Why are schemas strings?

It would seem much more natural to make them JSON structures since they're almost that anyway.

yusufnb · 6 years ago

Strings make it easy to define schemas as configs, and they can be implemented uniformly across different languages. In a world of microservices, a Python and a Go service can easily share the schema definitions via some config files. Also, strings provide the best minimalist syntax.

ZenPsycho · 6 years ago

then you might as well use jsonschema.

mikl · 6 years ago

Indeed, there seems to be no reason to use this instead of JSON Schema.

tckr · 6 years ago

JSON schema and ajv are my go to tools for this: https://github.com/epoberezkin/ajv

suref · 6 years ago

I built something similar for my latest webapp to validate json-requests in python, the syntax is like this:

    {
        "content": Required(str),
        "username": Required(str, validate=username_is_ok, transform=lambda x: x.lower(), 
                   fail_message="Username isn't valid"),
        "message": Optional(str),
        "some_list": Required({
            "name": Required(str),
            "date": Required(str)
        }, is_list=True)
    }

Then I can provide this in a hook to the request method.
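For readers wondering how such a schema could be applied, here is a minimal sketch. It is a guess at the shape, not the code behind the comment above: `Field`, `apply`, and the order of steps (type check, then `validate`, then `transform`) are all assumptions.

```python
class Field:
    """Hypothetical schema entry: a type (or nested schema dict) plus options."""

    required = True

    def __init__(self, kind, validate=None, transform=None,
                 fail_message=None, is_list=False):
        self.kind = kind
        self.validate = validate
        self.transform = transform
        self.fail_message = fail_message
        self.is_list = is_list

    def clean(self, key, value):
        if self.is_list:
            if not isinstance(value, list):
                raise ValueError(f"{key}: expected a list")
            return [self._clean_one(key, item) for item in value]
        return self._clean_one(key, value)

    def _clean_one(self, key, value):
        if isinstance(self.kind, dict):
            return apply(self.kind, value)  # nested object schema
        if not isinstance(value, self.kind):
            raise ValueError(self.fail_message or f"{key}: expected {self.kind.__name__}")
        if self.validate is not None and not self.validate(value):
            raise ValueError(self.fail_message or f"{key}: failed validation")
        return self.transform(value) if self.transform else value


class Required(Field):
    required = True


class Optional(Field):
    required = False


def apply(schema, data):
    """Return a cleaned copy of `data`, or raise ValueError on the first problem."""
    if not isinstance(data, dict):
        raise ValueError("expected an object")
    cleaned = {}
    for key, field in schema.items():
        if key not in data:
            if field.required:
                raise ValueError(f"missing key: {key}")
            continue
        cleaned[key] = field.clean(key, data[key])
    return cleaned


schema = {
    "username": Required(str, validate=str.isalnum, transform=str.lower,
                         fail_message="Username isn't valid"),
    "message": Optional(str),
}
print(apply(schema, {"username": "Alice"}))  # prints {'username': 'alice'}
```

Keys not named in the schema are dropped from the result in this sketch, which keeps unexpected input away from the request handler.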

1f97 · 6 years ago

can you tell me a bit more about this? i wanted something like this for a small api i have but ended up using marshmallow to validate jsons against defined schemas which i think is a bit overkill.

kingosticks · 6 years ago

If you haven't seen, these alternatives might also be helpful:

   * https://github.com/keleshev/schema
   * https://github.com/alecthomas/voluptuous
   * https://github.com/Pylons/colander