This patch does the following:
1. Removes test files from disk because they are no longer required.
2. Loads completions from the previous version of HashStore until an
update is applied.
3. Removes the older HashStore (.sbstore) and PrefixSet (.vlpset) files
during an update (see the sketch after this list).
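A minimal sketch of the cleanup in item 3, using standard C++ in place of the actual Classifier/nsIFile code; the helper name and the idea of keying the files by table name are assumptions made for illustration:
  #include <filesystem>
  #include <string>
  #include <system_error>

  namespace fs = std::filesystem;

  // Hypothetical helper: drop the legacy per-table store files once the new
  // format is in place. The extensions follow the commit message above.
  void RemoveLegacyStoreFiles(const fs::path& aStoreDir, const std::string& aTable) {
    for (const char* ext : {".sbstore", ".vlpset"}) {
      std::error_code ec;
      fs::remove(aStoreDir / (aTable + ext), ec);  // a missing file is not an error
    }
  }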
Differential Revision: https://phabricator.services.mozilla.com/D36002
--HG--
extra : moz-landing-system : lando
Completions are now stored in .vlpset, so we can remove them from .sbstore.
Functions related to optimizing how completions are read from .sbstore can
also be removed, because that is no longer HashStore's responsibility.
Differential Revision: https://phabricator.services.mozilla.com/D34574
--HG--
extra : moz-landing-system : lando
Here is the flow of how prefixes are handled during a V4 update:
1. Prefixes are received in the Safe Browsing update and stored in a
protocol buffer.
2. The prefixes are copied from the protocol buffer into the TableUpdate
structure.
3. The prefixes in the TableUpdate are merged with the local prefixes
(stored in LookupCacheV4).
4. The merged prefixes are processed by PrefixSet to generate the
in-memory prefix set data structure (MakePrefixSet).
In this patch, we free the prefixes stored in the TableUpdate right after
step 3. This reduces the peak memory used during an update (the peak
happens in step 4).
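A rough sketch of the change, with standard C++ containers standing in for TableUpdateV4/LookupCacheV4; the type and function names here are illustrative, not the actual classes:
  #include <algorithm>
  #include <cstdint>
  #include <vector>

  // Illustrative stand-in for the prefixes a TableUpdate carries after step 2.
  struct TableUpdateSketch {
    std::vector<uint32_t> mPrefixes;
  };

  std::vector<uint32_t> ApplyUpdateSketch(TableUpdateSketch& aUpdate,
                                          std::vector<uint32_t> aLocalPrefixes) {
    // Step 3: merge the update prefixes with the local prefixes.
    std::vector<uint32_t> merged = std::move(aLocalPrefixes);
    merged.insert(merged.end(), aUpdate.mPrefixes.begin(), aUpdate.mPrefixes.end());
    std::sort(merged.begin(), merged.end());
    merged.erase(std::unique(merged.begin(), merged.end()), merged.end());

    // Free the copy held by the TableUpdate right after the merge, so it is
    // not still alive while the in-memory prefix set is built (step 4, the peak).
    aUpdate.mPrefixes.clear();
    aUpdate.mPrefixes.shrink_to_fit();

    // Step 4 would build the prefix set (MakePrefixSet) from `merged` here.
    return merged;
  }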
Differential Revision: https://phabricator.services.mozilla.com/D26860
--HG--
extra : moz-landing-system : lando
The Safe Browsing V4 protocol uses SHA-256 as the checksum to verify the
integrity of update data and also the integrity of the prefix files.
The Safe Browsing V2 HashStore uses MD5 as the checksum to verify the
integrity of .sbstore.
Since we are going to use CRC32 as the integrity check for the V4 prefix
files, I think renaming the V4 "checksum" to SHA256 improves readability.
Differential Revision: https://phabricator.services.mozilla.com/D21460
--HG--
extra : moz-landing-system : lando
I tried to make TableUpdateArray point to const TableUpdate objects
everywhere, but there were two problems (illustrated in the sketch below):
- HashStore::ApplyUpdate() triggers a few Merge() calls, which involve
sorting the underlying TableUpdate object first.
- LookupCacheV4::ApplyUpdate() calls TableUpdateV4::NewChecksum() when the
checksum is missing, and that sets mChecksum.
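A compressed illustration of the conflict; the classes and method bodies below are stand-ins, but they show why a const pointer array does not compile when both code paths need to mutate the TableUpdate:
  #include <algorithm>
  #include <cstdint>
  #include <string>
  #include <vector>

  struct TableUpdateSketch {
    std::vector<uint32_t> mPrefixes;
    std::string mChecksum;

    // Merge() sorts the underlying data first, so it cannot be const.
    void SortBeforeMerge() { std::sort(mPrefixes.begin(), mPrefixes.end()); }

    // NewChecksum() fills in mChecksum when it is missing, so it cannot be const either.
    void NewChecksum() { if (mChecksum.empty()) { mChecksum = "computed"; } }
  };

  void ApplyUpdates(const std::vector<const TableUpdateSketch*>& aUpdates) {
    for (const TableUpdateSketch* update : aUpdates) {
      // update->SortBeforeMerge();  // error: calling a non-const member function
      // update->NewChecksum();      // error: calling a non-const member function
      (void)update;
    }
  }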
MozReview-Commit-ID: LIhJcoxo7e7
--HG--
extra : rebase_source : f6ca4bf42d1ddb897a974a0b19c7185b0b6b93af
Manually keeping tabs on the lifetime of these objects is a pain and is
the likely source of some of our crashes. I suspect we might also be
leaking memory.
This change creates an explicit copy of the main array for the update
thread, to avoid sharing a non-thread-safe data structure. This is a
shallow copy: only the pointers to the TableUpdates are copied, which
means one pointer per list (e.g. 5 in total for google4 in a new profile).
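A minimal sketch of the hand-off, with std::shared_ptr and std::thread standing in for Gecko's RefPtr and thread machinery; only the array of pointers is copied, never the TableUpdate objects themselves:
  #include <memory>
  #include <thread>
  #include <vector>

  struct TableUpdate { /* per-list update data */ };

  void ApplyUpdatesInBackground(std::vector<std::shared_ptr<TableUpdate>> aUpdates) {
    // ... apply each update on the update thread ...
    (void)aUpdates;
  }

  void StartBackgroundUpdate(const std::vector<std::shared_ptr<TableUpdate>>& aMainArray) {
    // Shallow copy: one pointer per list (e.g. 5 for google4 in a new profile).
    // The update thread then owns its own array, so the main array is never
    // shared mutable state across threads.
    std::vector<std::shared_ptr<TableUpdate>> copyForUpdateThread(aMainArray);

    std::thread([updates = std::move(copyForUpdateThread)]() mutable {
      ApplyUpdatesInBackground(std::move(updates));
    }).detach();
  }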
MozReview-Commit-ID: 221d6GkKt0M
--HG--
extra : rebase_source : e1b81f11bb9b41e465571a95845079f455b5868e
HashStore::ApplyUpdate() is a V2-only function, so we can be explicit
about that and remove unnecessary casts.
Also add a new update error code for when we fail to cast a TableUpdate
object to the expected protocol version.
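A hedged sketch of the cast check; the error-code and type names below are illustrative stand-ins for the real nsresult-style codes and TableUpdate classes:
  enum class UpdateStatus {
    Ok,
    ErrorUnexpectedVersion,  // new error code: wrong protocol version for this store
  };

  struct TableUpdate { virtual ~TableUpdate() = default; };
  struct TableUpdateV2 final : TableUpdate {};
  struct TableUpdateV4 final : TableUpdate {};

  // HashStore::ApplyUpdate() is V2-only, so the V2 type is taken directly ...
  UpdateStatus ApplyUpdateV2(TableUpdateV2&) { return UpdateStatus::Ok; }

  // ... and a caller holding a generic TableUpdate reports an error instead
  // of blindly casting when the object is not actually a V2 update.
  UpdateStatus ApplyUpdate(TableUpdate& aUpdate) {
    auto* v2 = dynamic_cast<TableUpdateV2*>(&aUpdate);
    if (!v2) {
      return UpdateStatus::ErrorUnexpectedVersion;
    }
    return ApplyUpdateV2(*v2);
  }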
MozReview-Commit-ID: 65BBwiZJw6J
--HG--
extra : rebase_source : 3f4bb0f7594d4015e2614ef747526ec5e8168a08
Given that we're no longer using dependent strings in
LookupCacheV4::PrefixString(), we will end up making a copy of the
prefixes at some point. Let's do it early and remove a bunch of
complicated code.
Make the string copies fallible so that we return an error and fail the
update instead of crashing.
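Gecko has fallible string and array operations for this; a rough standard-C++ equivalent of "copy early, and fail the update on OOM instead of crashing" would look like:
  #include <new>
  #include <string>

  // Returns false so the caller can fail the update, instead of crashing,
  // when the owned copy of the prefixes cannot be allocated.
  bool CopyPrefixesFallible(const std::string& aSource, std::string& aDest) {
    try {
      aDest.assign(aSource);  // the early, owned copy (no dependent string)
      return true;
    } catch (const std::bad_alloc&) {
      return false;  // maps to an update failure in the caller
    }
  }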
MozReview-Commit-ID: 5cZHSDIJSlD
--HG--
extra : rebase_source : 0ad130c2be9caf528bb764296836e91fa8a30916
We have a minimum requirement of VS 2015 for Windows builds, which supports
the z length modifier for format specifiers. So we don't need SizePrintfMacros.h
any more, and can just use %zu and friends directly everywhere.
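For example, printing a size_t no longer needs the macro indirection:
  #include <cstdio>
  #include <vector>

  int main() {
    std::vector<int> prefixes(5);
    // Previously spelled with SizePrintfMacros.h; %zu now works on every
    // supported compiler.
    std::printf("prefix count: %zu\n", prefixes.size());
    return 0;
  }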
MozReview-Commit-ID: 6s78RvPFMzv
--HG--
extra : rebase_source : 009ea39eb4dac1c927aa03e4f97d8ab673de8a0e
We write a lot of 4-byte prefixes to file, which results in many system
calls. We should buffer the writes and only write to the file when the
buffer is full or when we finish writing. nsIBufferedOutputStream is a
good candidate for this.
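The patch wraps the output stream in nsIBufferedOutputStream; the effect is the same as this standard-C++ sketch, which collects the 4-byte prefixes in memory and only issues a write when the buffer is full or at the end:
  #include <cstddef>
  #include <cstdint>
  #include <cstdio>
  #include <vector>

  // Collects 4-byte prefixes and hands them to the OS in large chunks,
  // instead of issuing one write per prefix.
  class BufferedPrefixWriter {
   public:
    explicit BufferedPrefixWriter(std::FILE* aFile, size_t aCapacity = 4096)
        : mFile(aFile), mCapacity(aCapacity) {
      mBuffer.reserve(mCapacity);
    }

    ~BufferedPrefixWriter() { Flush(); }

    void WritePrefix(uint32_t aPrefix) {
      const uint8_t* bytes = reinterpret_cast<const uint8_t*>(&aPrefix);
      mBuffer.insert(mBuffer.end(), bytes, bytes + sizeof(aPrefix));
      if (mBuffer.size() >= mCapacity) {
        Flush();  // one write call for ~1024 prefixes instead of 1024 calls
      }
    }

    void Flush() {
      if (!mBuffer.empty()) {
        std::fwrite(mBuffer.data(), 1, mBuffer.size(), mFile);
        mBuffer.clear();
      }
    }

   private:
    std::FILE* mFile;
    size_t mCapacity;
    std::vector<uint8_t> mBuffer;
  };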
MozReview-Commit-ID: CzGOd7iXVTv
--HG--
extra : rebase_source : 25f1ce804b9a53e0a0a4023a1aa91f1a0ed98547
In this patch, we make Safe Browsing V2 caching use the same algorithm as
V4, so we remove "mMissCache" for negative caching and the TableFreshness
check for positive caching.
However, the Safe Browsing V2 gethash response doesn't contain
negative/positive cache duration information, so we hard-code a fixed
value, 15 minutes, as the cache duration. This way, the caching mechanism
is the same for V2 and V4.
An extra effort for V2 is that we need to manually record prefix misses,
because we won't get any response for those prefixes (implemented in
nsUrlClassifierLookupCallback::CacheMisses).
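A sketch of the hard-coded V2 expiry; the constant and struct names are illustrative:
  #include <chrono>

  // V2 gethash responses carry no cache-duration fields, so both positive
  // and negative cache entries expire a fixed 15 minutes after the response.
  constexpr std::chrono::seconds kV2CacheDuration = std::chrono::minutes(15);

  struct V2CacheEntrySketch {
    std::chrono::steady_clock::time_point negativeExpiry;
    std::chrono::steady_clock::time_point positiveExpiry;
  };

  V2CacheEntrySketch MakeV2CacheEntry() {
    const auto now = std::chrono::steady_clock::now();
    return {now + kV2CacheDuration, now + kV2CacheDuration};
  }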
--HG--
extra : rebase_source : 0fb7938464ad24a01a51493ecb1b5c0197c16123
In Bug 1323953, we always send a 4-byte prefix for completion, and the
prefix is also used as the key to store the cache result from the gethash
request. Since it is always 4 bytes, we can convert it to an integer for
simplicity.
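Since the prefix is always exactly 4 bytes, it folds into a uint32_t key; a sketch of the conversion (the big-endian byte order here is an illustrative choice, not necessarily what the patch uses):
  #include <cstdint>
  #include <string>

  // Converts a 4-byte prefix (as sent in the gethash request) into an
  // integer that can be used directly as the cache key.
  uint32_t PrefixToUint32(const std::string& aPrefix) {
    // Caller guarantees aPrefix.size() == 4.
    return (static_cast<uint32_t>(static_cast<uint8_t>(aPrefix[0])) << 24) |
           (static_cast<uint32_t>(static_cast<uint8_t>(aPrefix[1])) << 16) |
           (static_cast<uint32_t>(static_cast<uint8_t>(aPrefix[2])) << 8) |
           static_cast<uint32_t>(static_cast<uint8_t>(aPrefix[3]));
  }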
MozReview-Commit-ID: Lkvrg0wvX5Z
--HG--
extra : rebase_source : 7002b1f55bc0f88ed90340082574cc931f8db87a
This patch includes the following changes:
1. nsUrlClassifierHashCompleter.js
nsUrlClassifierHashCompleter.idl
- Add a completionV4 interface for the hash completer to pass response
data to the DB service.
- Process the response data, which includes the negative cache duration
plus the matched full hashes and the cache duration for each match. Full
matches are passed through the nsIFullHashMatch interface.
- Change _requests.responses from an array containing matched full hashes
to a dictionary so that it can store additional information like the
negative cache duration.
2. nsUrlClassifierDBService.cpp
- Implement the CompletionV4 interface and store the response data in a
CacheResultV4 object. Converting the expiration duration to an expiration
time is handled here.
- Add a CacheResultToTableUpdate function to convert V2 & V4 cache
results to TableUpdate objects.
3. LookupCache.h
- Extend CacheResult into CacheResultV2 and CacheResultV4 so we can store
the response data from CompletionV2 and CompletionV4.
4. HashStore.h
- Add an API and a member variable to TableUpdateV4 to store the response
data. The TableUpdate object is used by the DB service to pass update
data or a gethash response to the Classifier, so we need to extend
TableUpdateV4 to be able to store the fullHashes.find response.
5. Entry.h
- Define the structure for how we cache the fullHashes.find response (see
the sketch after this list).
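A rough sketch of the shape of the cached fullHashes.find response described in item 5; the field names are illustrative, not the actual Entry.h definitions:
  #include <cstdint>
  #include <string>
  #include <vector>

  // One matched full hash from the fullHashes.find response, with its own
  // positive cache expiry.
  struct FullHashMatchSketch {
    std::string fullHash;        // 32-byte full hash
    int64_t positiveExpirySec;   // absolute time derived from the per-match duration
  };

  // Everything cached for one requested 4-byte prefix.
  struct FullHashResponseSketch {
    int64_t negativeExpirySec;                 // from the negative cache duration
    std::vector<FullHashMatchSketch> matches;  // may be empty (a miss)
  };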
MozReview-Commit-ID: FV4yAl2SAc6
--HG--
extra : rebase_source : 02676ded9bc0580c24b4c906199a4abaf004e296
A new function, Classifier::AsyncApplyUpdates(), is implemented for async
updates. In addition, all public Classifier interfaces become "worker
thread only", and we remove DBServiceWorker::ApplyUpdatesBackground/Foreground.
In DBServiceWorker::FinishUpdate, instead of calling
Classifier::ApplyUpdates, we call Classifier::AsyncApplyUpdates and
install a callback to notify the update observer when the update is
finished. The callback occurs on the caller thread (i.e. the worker
thread).
As for the shutdown issue, when the main thread is notified to shut down,
we first *synchronously* dispatch an event to the worker thread to shut
down the update thread. After synchronizing with all the other threads,
we send the last two events, "CancelUpdate" and "CloseDb", to notify any
dangling update (i.e. BeginUpdate was called but FinishUpdate wasn't) and
do the cleanup work.
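A condensed sketch of the hand-off, with standard C++ standing in for the Gecko thread machinery; the point is that the heavy work runs off the worker thread while the completion callback is queued back so it runs on the caller (worker) thread:
  #include <functional>
  #include <memory>
  #include <mutex>
  #include <queue>
  #include <thread>

  // Minimal stand-in for the worker thread's event queue.
  class CallerTaskQueue {
   public:
    void Post(std::function<void()> aTask) {
      std::lock_guard<std::mutex> lock(mMutex);
      mTasks.push(std::move(aTask));
    }
    // Called only on the caller (worker) thread.
    void DrainOnCallerThread() {
      std::queue<std::function<void()>> tasks;
      {
        std::lock_guard<std::mutex> lock(mMutex);
        std::swap(tasks, mTasks);
      }
      while (!tasks.empty()) { tasks.front()(); tasks.pop(); }
    }
   private:
    std::mutex mMutex;
    std::queue<std::function<void()>> mTasks;
  };

  void AsyncApplyUpdatesSketch(std::shared_ptr<CallerTaskQueue> aCallerQueue,
                               std::function<bool()> aApplyUpdates,
                               std::function<void(bool)> aOnComplete) {
    std::thread([queue = std::move(aCallerQueue), apply = std::move(aApplyUpdates),
                 done = std::move(aOnComplete)]() {
      bool succeeded = apply();  // the actual update work, off the worker thread
      // The callback is only queued here; it runs when the caller thread
      // drains its queue, i.e. on the caller (worker) thread.
      queue->Post([done, succeeded] { done(succeeded); });
    }).detach();
  }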
MozReview-Commit-ID: DXZvA2eFKlc
--HG--
extra : rebase_source : cd2e27a6b679d2c96e769854d1582ed2dcda12bb
Switch over to using a single fallible array in ByteSliceWrite. This
allows us to use only 2 transient arrays, each 1/4 the size of the input
array, rather than 5 transient arrays.
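A hedged sketch of the idea, not the actual ByteSliceWrite implementation: each 32-bit prefix is split into its four bytes, and a single fallibly allocated slice buffer, one quarter of the input's byte size, is reused for each of the four passes:
  #include <cstdint>
  #include <cstdio>
  #include <new>
  #include <vector>

  // Writes the aByteIndex-th byte of every prefix into aSlice (reused across passes).
  static void FillSlice(const std::vector<uint32_t>& aPrefixes, int aByteIndex,
                        std::vector<uint8_t>& aSlice) {
    aSlice.clear();
    for (uint32_t prefix : aPrefixes) {
      aSlice.push_back(static_cast<uint8_t>(prefix >> (8 * (3 - aByteIndex))));
    }
  }

  bool ByteSliceWriteSketch(const std::vector<uint32_t>& aPrefixes, std::FILE* aOut) {
    std::vector<uint8_t> slice;
    try {
      slice.reserve(aPrefixes.size());  // the single transient array, 1/4 the
                                        // byte size of the uint32_t input
    } catch (const std::bad_alloc&) {
      return false;  // fallible: fail the write instead of crashing
    }

    for (int byteIndex = 0; byteIndex < 4; ++byteIndex) {
      FillSlice(aPrefixes, byteIndex, slice);
      if (std::fwrite(slice.data(), 1, slice.size(), aOut) != slice.size()) {
        return false;
      }
    }
    return true;
  }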