An Ideal Homestead
Oct. 10th, 2018 06:26 am

We are information processes defined by what we do. We are part of the world, growing within our immediate information sources. We have devised a variety of systems with useful properties to deal with data access permissions, querying, serialization, analytics, search... the last four of which are quite synonymous -- they are all about information retrieval -- and all five of which fall under a single word -- sharing.
Maximally Flexible Sharing
I had been thinking of an ideal information system, one that would transcend the boundaries of database schemas and of disk files. What I mean is a system that would allow granting permissions to any single byte, and allow integrating and managing records of any type and in any language, so that when we write something on our home system, it is universally reusable if we want it to be, and when we read something from other systems, we bring the control interface over to our system, so that we can interact with those objects on other systems without manually revisiting those other systems.
The question is -- how?
NoAPI, a metamarkup instead
APIs create interfaces to objects (they objectify systems), defining a limited set of functions to interact with them. That makes the data behind them only as useful as the API allows, no more, no less. If you want a single byte that Bob shared with Alice, you have to implement a user system, and then special functions to answer that kind of query. And if you want the sysadmin not to be able to see it, you have to let Bob and Alice encrypt the data with their own keys. The best example of that is ProtonMail. However, for a function to extract a single byte that Bob shared with Alice, the function needs to see the mask and apply it to the source text. If that function is compromised, the system administrator may see the original text.
So, I thought, there has to be some dumb and simple function, and a metalanguage that you use to mark fragments of text within your texts, designating whom to share them with. To illustrate the idea, I've created sharesplit, which works by introducing a tag {:CONDITIONS|CONTENT:}. For example, to share something with Alice, you'd write {:Alice|And, btw., this should be only visible to Alice:} within the text.
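To make this concrete, here is a minimal sketch in Python of how such a slicer could work -- a regex illustration of the tag format above, not the actual sharesplit implementation, which may differ:

import re

# A minimal sharesplit-style slicer (an illustration only; the real
# sharesplit may work differently). Tags look like {:CONDITIONS|CONTENT:}.
TAG = re.compile(r'\{:(?P<conditions>[^|{}]+)\|(?P<content>.*?):\}', re.DOTALL)

def render(text, viewer):
    # Keep a tagged fragment (minus its markup) only if the viewer is
    # named in its comma-separated conditions; otherwise drop it.
    def repl(match):
        allowed = {name.strip() for name in match.group('conditions').split(',')}
        return match.group('content') if viewer in allowed else ''
    return TAG.sub(repl, text)

text = 'Note. {:Alice|And, btw., this should be only visible to Alice:}'
print(render(text, 'Alice'))  # keeps the private fragment
print(render(text, 'Bob'))    # strips it for anyone else

The point of keeping the function this dumb is that it carries no business logic of its own: all the sharing semantics live in the markup, inside the text itself.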
NoSQL, a generic hashmap instead
SQL, in combination with very specific schemas for tables and APIs, lets software engineers design efficient structures for very specific I/O, analytics, and permission systems. However, the generic need to integrate and analyze all data from all sources means it is time-inefficient to come up with new schemas to import that data every single time.
To unify all the different schemas, which can be very diverse, I have come up with a metaform, based on ontologies, that allows combining datasets: first, write raw data with schemas, where each record has a schema attribute, like so:
item = {"field1": "Joe", "field2": {"field3": "21"}, "*": "https://github.com/wefindx/ooio/wiki/example#test1"}
The key with a star (asterisk) specifies where the explanation of the schema lives, and that allows normalizing the record with something like ooio (pip install ooio):
>>> import ooio
>>> ndata = ooio.metaform.normalize(item, ooio.get_schema(item['*']))
>>> ndata
{'_:username#string': 'Joe', 'field2': {'_:age#float-years': '21'}, '*': 'https://github.com/wefindx/ooio/wiki/example#test1'}
>>> ooio.metaform.formatize(ndata)
{'_:username': 'Joe', '_:properties': {'_:age': 21.}}
This translates the data into data whose fields are normalized to the ontological vocabulary defined by the map at the schema link (in this case https://github.com/wefindx/ooio/wiki/example#test1), at arbitrary levels of JSON nestedness. The result is ready to be saved to an information retrieval system, where rules can be applied at the level of key values -- the sharesplit slicer, or any other filter -- to share fragments of information with whomever necessary.
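For intuition, the map that the asterisk key points to can be imagined as something like the following -- a hypothetical sketch inferred from the normalized output above; the actual format of the wiki page may differ:

schema = {
    'field1': '_:username#string',
    'field2': {'field3': '_:age#float-years'},
}

Each raw field name, at whatever nesting level, maps to an ontological term, so the normalizer only has to walk the record and the map in parallel.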
This pattern, unlike JSON-LD, is not constrained to a single parent vocabulary (@context): it allows combining multiple vocabularies on a per-key basis. For example, it is possible to use a term from one vocabulary for one field and a term from another vocabulary for another field of the same record, making use of multiple ontologies at once.
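As a sketch of what that could look like (the vocabulary URLs below are invented for illustration, and the exact key format is my assumption, not ooio's documented one):

record = {
    'https://example.org/people-vocab#username': 'Joe',
    'https://example.org/medical-vocab#age': 21.0,
}

Here username resolves to a term in one ontology and age to a term in another, within the same record.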
In other words, this is a bit like speaking multiple languages in the same sentence: if there is a good term for something in language A that doesn't exist in language B, why not use both languages at once, rather than introducing and defining loan words, or implementing the features of one language in another, making languages compete?
But what about that bit: "when we read something from other systems, we bring the control interface over to our system, so that we can interact with those objects on other systems without manually revisiting those other systems"?
HERE WILL BE A VIDEO DESCRIBING IT...
The good bit: this allows taking raw application data and understanding it. Take raw source code, auto-understand it, save it to a general-purpose database, and, when we change it, automatically change the object at its origin.