Move artifact extractor logic to classifier #1268

Merged

@chunyong-lin chunyong-lin commented Jul 2, 2020

to: @airbnb/streamalert-maintainers
related to: #1250
resolves: #1265

Background

Unfortunately, Firehose Data Transformation is expensive: it charges us twice for the same data sent to a Firehose delivery stream. So we are moving the Artifact Extractor logic into the Classifier, which should only increase the Classifier Lambda's running time by a few seconds.

Changes

  • Move the artifact extractor logic into the classifier. The artifact extractor Lambda function and its related resources are deleted.
  • Add a send_to_artifacts flag to the normalizer, so that only interesting information is sent to the artifacts table.
  • Update the custom metrics for artifacts.
  • Update the unit test cases.
  • Update the Normalization docs.
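
To illustrate the send_to_artifacts change listed above, here is a minimal sketch of how a classifier-side step might filter normalized values before writing them to the artifacts table. Everything in it is an assumption made for illustration: the `NORMALIZER_CONFIG` layout, the `extract_artifacts` helper, the artifact fields, and the choice to treat a missing flag as "send" are hypothetical and are not taken from the StreamAlert code base.

```python
from typing import Any, Dict, List

# Hypothetical normalizer config; the names and layout are illustrative only.
NORMALIZER_CONFIG: Dict[str, Dict[str, Any]] = {
    "ip_address": {"fields": ["sourceIPAddress"], "send_to_artifacts": True},
    "user_agent": {"fields": ["userAgent"], "send_to_artifacts": False},
    "account": {"fields": ["recipientAccountId"]},  # flag omitted
}


def extract_artifacts(
    record: Dict[str, Any],
    record_id: str,
    config: Dict[str, Dict[str, Any]],
) -> List[Dict[str, str]]:
    """Build one artifact per normalized value whose type is flagged for sending."""
    artifacts: List[Dict[str, str]] = []
    for normalized_type, settings in config.items():
        # Assumption: a missing flag means the type is sent to the artifacts table.
        if not settings.get("send_to_artifacts", True):
            continue
        for field in settings["fields"]:
            value = record.get(field)
            # Skip fields that are absent or None so no empty artifacts are produced.
            if value is None:
                continue
            artifacts.append(
                {"type": normalized_type, "value": str(value), "record_id": record_id}
            )
    return artifacts
```

In this sketch, a type with `send_to_artifacts` set to false is still available for normalization but never produces an artifact, which is one way to keep the artifacts table limited to interesting values.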

Testing

Tested the new code in a staging account, and it works as expected.

@chunyong-lin chunyong-lin changed the title from "Move artifact extractor logic to classifier" to "[WIP] Move artifact extractor logic to classifier" Jul 2, 2020
@chunyong-lin chunyong-lin changed the title from "[WIP] Move artifact extractor logic to classifier" to "Move artifact extractor logic to classifier" Jul 2, 2020
@chunyong-lin chunyong-lin marked this pull request as ready for review July 3, 2020 01:38

@Ryxias Ryxias left a comment

Good luck

@chunyong-lin chunyong-lin merged commit ca63116 into release-3-3-0 Jul 6, 2020
@chunyong-lin chunyong-lin deleted the move_artifact_extractor_logic_to_classifier branch July 6, 2020 21:11
ryandeivert added a commit that referenced this pull request Aug 5, 2020
* bumping version to 3.3.0

* Demisto playbook (#1239)

* Supports dynamic parameters (#1244)

* Add dynamic param support

* I am caveman unga bunga smash

* Oogey boogey beh

* [apps][aliyun] Set EndTime in the request (#1247)

* [apps][aliyun] Set EndTime in the request

* github action sucks, changing comment to retrigger action

Co-authored-by: Chunyong Lin <[email protected]>

* updating metavar/description for cli flag (#1252)

* support for packaging user specified conf directory (#1253)

* adding packaging support for user specified config path

* adding test for copying directory to alternate destination

* pr feedback

* adding fix for omitting coverage for forks (#1256)

* another attempt at coveralls BS (#1257)

* Update getting-started.rst (#1254)

The current Getting Started instructions don't mention that you need to add the `set` command. As it stands, this is the error I received when setting up Streamalert:

```
(.env) jordan@mac:~/src/aws/streamalert/streamalert$ python manage.py output aws-sns
usage: manage.py output [-h]
                        {set,set-from-file,generate-skeleton,get,list} ...
manage.py output: error: invalid choice: 'aws-sns' (choose from 'set', 'set-from-file', 'generate-skeleton', 'get', 'list')
```

Co-authored-by: Ryxias <[email protected]>
Co-authored-by: ryandeivert <[email protected]>

* Feature artifact extractor (#1250)

* bumping version to 3.2.0

* migrating Athena function to use tf_lambda module (#1217)

* rename of athena function

* updating terraform generation code to use tf_lambda module

* updating tf_athena module to remove lambda code

* updates for packaging, rollback, and deploy

* misc updates related to config path renaming, etc

* removing no-longer-used method (athena is default)

* addressing PR feedback

* adding more granular time prefix to athena client

* fixing duplicate resource issues (#1218)

* fixing duplicate resource issues

* fixing some other bugs in #1217

* fixing tf targets for athena deploy (#1220)

* adding "--config-dir" flag to CLI to support specifying path for config files (#1224)

* adding support for supplying path to config via CLI flag

* misc touchups

* updating publishers to accept configurable paths (#1223)

* moving matchers outside of rules directory

* updating rules for new matcher path

* updating unit test for consistency

* making publisher locations configurable

* fixing typo

* updating tf_lambda module to remove extra resources (#1225)

* fixing rollback for all functions, removing 'all' flag for function deploys (#1222)

* updating rollback functionality to include all funcs

* updating tests to check for rollback of all funcs

* updating docs

* fixing tf cycle and index issue (#1226)

* [core] Artifact Extractor lambda code

* [core] load firehose client for artifact extractor

* [core] Move FirehoseClient to shared folder

* [test] Here we go pylint

* [docs] Add high level Normalization doc

* Ooops, leftover print

* Address comment about doc

* bumping version to 3.3.0

* Remove a FIXME comment

* Add terraform resources

* Fix some issues discovered during terraform build

* [test] Add unit test cases and tune some code during testing

* [cli] update artifact extractor module resource for lambda deploy

* [doc] Update docstring

* pylint

* Address comments

* Address more comments

* [bugs] Fixed couple bugs before normalization code change

* [core] Refactor normalization code, unit test cases and add new ones

* [core] Re-implement normalization code \O/

* [docs] Update docs

* [docs] More docs

* Rework normalization logic to use key path from conf/schemas/*.json to find original key

* [tests] update unit test cases

* [rule][conf] Update conf right_to_left_character rule to use new normalization

* [docs] Update docs and address comments

* Fix a bug and update the unit test helper

* Remove unnecessary comments

* buggy, remove None values from normalization field

* Add record id to artifacts and record

* [tf] Upgrade terraform aws provider to 2.48.0

* Add condition to normalizer

* [docs] Update docs

* Address comment

* Add three custom metrics

* [cli] fix undeclared module issue related to artifact_extractor

* [doc] Update artifact extractor deploy instruction

Co-authored-by: Ryan Deivert <[email protected]>
Co-authored-by: Chunyong Lin <[email protected]>

* [config] Add Okta log schema (#1263)

* [config] Add Okta log schema

* Add test record

* Fix tests

* Fix tests

Co-authored-by: Matt Muller <[email protected]>

* Add additional G-Suite Admin Audit types. (#1260)

Co-authored-by: darkjokelady <[email protected]>

* Update getting-started.rst (#1255)

* Update getting-started.rst

Fix path to `cloudtrail_root_account_usage.py` rule being modified in the Getting Started documentation.

* test ci change in fork

* second update for ci tests in forks

Co-authored-by: Ryxias <[email protected]>
Co-authored-by: ryandeivert <[email protected]>
Co-authored-by: ryandeivert <[email protected]>
Co-authored-by: darkjokelady <[email protected]>

* [core] fix bug when normalization config empty (#1262)

* [core] fix bug when normalization config empty

* [test] Update unit test case

* [docs] Update how to search artifacts table

Co-authored-by: Chunyong Lin <[email protected]>

* CLI support for extra user supplied terraform files (#1267)

* adding cli arg to supply additional terraform config files

* removing old tf cleanup code since temp path will be used

* cliconfig support for temp tf directory

* updates to tf_runner and run_command for temp tf path

* removing tf clean command since runs are now idempotent

* packaging change for tf temp path

* logic for copying files to tf temp path

* removing init backend option

* cleanup

* fix unit tests

* config support for extra tf files

* doc update for `terraform_files` setting

* unit test for cliconfig terraform files

* fix for init backend outside of generate logic

* update to support supplying static dir for builds

* fixing issue with streamalert.zip not existing at build times (#1269)

* Move artifact extractor logic to classifier (#1268)

* [core] Move artifact extractor logic to classifier

* [core] Add send_to_artifacts flag to normalizer

* [cli] Remove leftover variables, permissions

* [core] Fix bugs, update custom metrics for artifacts

* [tests] Update test cases

* [docs] Update docs

* [cli] Update artifact_extract.tf.json path after PR #1267 merged

Co-authored-by: Chunyong Lin <[email protected]>

* rebuilding pkg on every tf run (#1270)

* ensuring prefix is a lowercase string (#1272)

* updating dependencies (#1277)

* updating deps

* updating precompiled deps

* misc cleanup

* [core][apps] Increase aliyun timeout (#1274)

Co-authored-by: Chunyong Lin <[email protected]>

* proper cloudwatch events permissions for cross account access (#1276)

* updating cloudwatch events module to support advanced event bridge rule

* adding proper support for cloudwatch event permission for cross account cwe

* terraform gen code for new cross account cwe perms

* doc updates for x-acct cwe perms

* fix readme

* reverting usage of cloudformation stack

* allowing optional scopes

* proper provider support for different regions

* fixing pylint

* adding role arn to target

* installing venv in vagrant (#1278)

* fixing copying of zips, since lambda layers are zips (#1279)

* cloudtrail module config tweak (#1280)

* updating cloudtrail module config slightly

* updating unit tests and docs for cloudtrail module change

* fixing default for enable_events

* update to docs

* raising exceptions when error occurs while downloading from s3 (#1281)

* raising exceptions with s3 download errors

* fixing unit test

* addressing issue with 0 byte files in s3 (#1284)

* adding support for other accounts to publish to sns topic (#1283)

* fixing a bug I think but who really knows (#1285)

* adding fix for #1282 (#1286)

Co-authored-by: Ryxias <[email protected]>
Co-authored-by: darkjokelady <[email protected]>
Co-authored-by: Chunyong Lin <[email protected]>
Co-authored-by: Jordan Wright <[email protected]>
Co-authored-by: themullinator <[email protected]>
Co-authored-by: Matt Muller <[email protected]>
Co-authored-by: Gavin <[email protected]>