Is it possible to remove specific items from Parquet and NoSQL Targets?

Question:

Is it possible to remove specific items from Parquet and NoSQL targets in MLRun? I checked FeatureStore, ParquetTarget, and NoSQLTarget but did not find any relevant methods.

I only found a way to remove a whole feature set from the metastore (from the DB), which does not touch the specific data items:

mlrun.feature_store.delete_feature_set(name, project='', tag=None, uid=None, force=False)

That is not my case: I need to remove only specific data items, not the information in the metastore. Thanks for any help.

BTW: I am using MLRun version 1.2.1

Asked By: XiongChan


Answers:

I did not see a relevant method in MLRun 1.2.1, but there are a few workarounds.

1. About ParquetTarget

  • The Parquet format is immutable, so individual rows cannot be deleted in place.
  • If you partition the Parquet data, however, you can remove specific Parquet file(s). E.g. with partitioning by year, you can delete the files for a specific year on the file system (via command-line rm), which in practice deletes the requested content.
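The partition-level delete described above can also be scripted. The following is only a minimal sketch: it assumes the target is written to a local (or mounted) file system and that partitions are stored as Hive-style directories such as year=2021, which may differ from your actual layout. The helper name delete_partition and the demo directory are hypothetical, not part of MLRun.

```python
import shutil
import tempfile
from pathlib import Path


def delete_partition(dataset_root: Path, partition_dir: str) -> bool:
    """Remove one partition directory (and the Parquet files in it).

    Returns True if the partition existed and was removed, False otherwise.
    """
    target = (dataset_root / partition_dir).resolve()
    # Safety check: refuse to delete anything outside the dataset root.
    if dataset_root.resolve() not in target.parents:
        raise ValueError(f"{target} is not inside {dataset_root}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Demo on a throwaway directory that mimics a dataset partitioned by year
# (assumed layout, for illustration only).
demo_root = Path(tempfile.mkdtemp())
for year in (2020, 2021, 2022):
    part = demo_root / f"year={year}"
    part.mkdir()
    (part / "part-0.parquet").touch()  # empty placeholder, not a real Parquet file

removed = delete_partition(demo_root, "year=2021")
remaining = sorted(p.name for p in demo_root.iterdir())
```

Deleting at partition granularity only works if the rows you want to drop line up with a partition; otherwise the affected files would have to be rewritten without those rows.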

2. About NoSqlTarget

  • This format is not immutable and you can easily update value(s), but MLRun 1.2.1 has no API for deleting items.
  • Each key is persisted as an entry visible on the file system (v3io is compatible with a file system), so you can also delete content (one key or several) by deleting the corresponding file(s), again via command-line rm.
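The same idea for the NoSQL target, as a rough sketch rather than a definitive recipe. It assumes the v3io container is mounted as a file system and that each key shows up as a file named after the key inside the table directory. The helper name delete_key_files and the demo paths are hypothetical, so check how your keys are actually laid out before deleting anything.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def delete_key_files(table_dir: Path, keys: Iterable[str]) -> List[str]:
    """Delete the file that represents each key; return the keys actually removed."""
    removed = []
    for key in keys:
        item = table_dir / key
        # Only accept plain names that sit directly inside the table directory
        # and that exist as regular files; everything else is skipped.
        if item.parent != table_dir or not item.is_file():
            continue
        item.unlink()
        removed.append(key)
    return removed


# Demo on a throwaway directory standing in for the mounted table (assumption).
demo_table = Path(tempfile.mkdtemp())
for name in ("user1", "user2", "user3"):
    (demo_table / name).touch()

removed_keys = delete_key_files(demo_table, ["user2", "missing"])
remaining_keys = sorted(p.name for p in demo_table.iterdir())
```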

3. About RedisTarget

  • It is similar to NoSqlTarget (values are easy to update).
  • Redis itself fully supports deleting key(s).

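Command line (the first command is entered inside redis-cli, the second is run from a shell and deletes all keys matching user*):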

DEL user
redis-cli KEYS "user*" | xargs redis-cli DEL

Code in Python:

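import redis

# Connect to Redis (defaults to localhost:6379, db 0)
r = redis.Redis()
# Delete the key 'test'; returns the number of keys that were removed
r.delete('test')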
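To delete many keys by pattern from Python, a possible sketch is below. It uses SCAN (redis-py's scan_iter) instead of KEYS, since KEYS walks the whole keyspace in one blocking call. The helper name delete_keys_by_pattern and the batch size are my own choices for illustration, and the user* pattern is just the example from above; the actual key names that MLRun writes may differ.

```python
from typing import Any


def delete_keys_by_pattern(client: Any, pattern: str, batch_size: int = 500) -> int:
    """Delete all keys matching `pattern`, scanning incrementally.

    `client` is expected to behave like redis.Redis (scan_iter and delete).
    Returns the number of keys that were removed.
    """
    deleted = 0
    batch = []
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        # Send DEL in batches to avoid one round trip per key.
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    # Delete whatever is left in the last, partially filled batch.
    if batch:
        deleted += client.delete(*batch)
    return deleted


# Usage against a real server (assumes redis-py is installed and Redis runs locally):
#   import redis
#   r = redis.Redis()
#   delete_keys_by_pattern(r, "user*")
```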
Answered By: JIST