ImageBridgeChild::GetSingleton returns null in the GPU process. This causes
DrawTargetWebgl::CopyToSwapChain to use an incorrect texture type for WebGL
canvases when in the GPU process. To work around this, determine the texture
type for WebGL in the content process and send it to CanvasTranslator for
later usage.
Differential Revision: https://phabricator.services.mozilla.com/D202292
WGR is fairly slow at generating specialized circle geometry, whereas we can
generate similar geometry much faster using the AAStroke filled circle
implementation now.
Differential Revision: https://phabricator.services.mozilla.com/D201939
CopyToSwapChain was silently failing, so no texture was pushed
to RemoteTextureMap, and any subsequent wait on that texture would
time out.
The failure occurred in DrawTargetWebgl::FlushFromSkia, because the
DT's size actually exceeded the value of the texture limit pref when
it was attempting to allocate a temporary texture to blend back a
Skia layer to the WebGL framebuffer. This is fixed by allowing layering
to bypass this limit, as it is always expected that layer blending
succeed.
To guard against future instances of this bug, CopyToSwapChain now returns
a boolean result so that it is fallible and can signal to CanvasTranslator
that it needs to take appropriate fallback measures on failure.
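The fallible-result pattern described above can be sketched as follows. This is a minimal illustration, not the actual Gecko code: FakeDrawTarget, Present, PresentResult, and the maxTextureSize field are hypothetical stand-ins, and the size check is only a placeholder for whatever can make the real copy fail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a WebGL-backed draw target (not the real class).
struct FakeDrawTarget {
  int32_t width;
  int32_t height;
  int32_t maxTextureSize;  // Hypothetical limit standing in for a failure cause.

  // Reports failure to the caller instead of failing silently.
  bool CopyToSwapChain() const {
    assert(maxTextureSize > 0);
    return width <= maxTextureSize && height <= maxTextureSize;
  }
};

enum class PresentResult { Accelerated, Fallback };

// Hypothetical caller playing the role of the translator: it inspects the
// boolean result and chooses a fallback path when the copy did not succeed.
inline PresentResult Present(const FakeDrawTarget& aDT) {
  if (aDT.CopyToSwapChain()) {
    return PresentResult::Accelerated;
  }
  return PresentResult::Fallback;
}
```

With this shape, a failed copy is visible to the caller right away rather than surfacing later as a timed-out wait.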
Differential Revision: https://phabricator.services.mozilla.com/D199794
When a context loss occurs on DrawTargetWebgl, this may result in a fallback TextureData
being created. The DrawTargetWebgl and the fallback TextureData are currently managed by two
different RemoteTextureOwnerClients, which is not safe.
To fix this, CopyToSwapChain is modified so that it can be supplied a RemoteTextureOwnerClient.
Then CanvasTranslator can inject its own RemoteTextureOwnerClient into CopyToSwapChain, rather
than letting CopyToSwapChain use its own separate internal RemoteTextureOwnerClient.
This also tries to address a few other data consistency bugs with the fallback TextureData.
Differential Revision: https://phabricator.services.mozilla.com/D198487
After a minimize, an unknown amount of time may pass or unknown circumstances may arise that ultimately lead to
a GL context loss. To try to mitigate this, cache software snapshots of DrawTargetWebgls when we are
about to minimize so that these can hopefully be copied into fallback TextureDatas later if the context
is actually lost.
Differential Revision: https://phabricator.services.mozilla.com/D198129
It's wasteful to call DrawTargetWebgl::BeginFrame if we're locking in a read-only mode. It may
also mess up DrawTargetWebgl's internal profiling if we count a read-only use as an actual frame.
Differential Revision: https://phabricator.services.mozilla.com/D197483
Chartjs heavily relies on circle drawing, which dispatches to the FillCircle and
StrokeCircle hooks in DrawTarget. These need to be implemented in DrawTargetWebgl.
Differential Revision: https://phabricator.services.mozilla.com/D194353
This adds the necessary infrastructure for CanvasTranslator to allocate DrawTargetWebgl
instead of just allocating TextureData, and to use RemoteTextureMap to handle sending
the DrawTargetWebgl frames to the compositor.
This optimizes snapshot transport to use fewer copies to and from shmems when we know
the snapshot contents can be sourced from a shmem.
This adds a blocking mechanism separate from deactivation. If allocating further DrawTargetWebgls
might cause problems, new ones can be prevented from being created while existing DrawTargetWebgls
continue processing events, so that we don't disrupt canvases that are already in flight.
PersistentBufferProviderAccelerated still remains the buffer provider for the new setup,
but just allocates a single RecordedTextureData internally. Since DrawTargetWebgl already
does its own swap chain management internally, we do not want to use the multiple texture
client strategy that PersistentBufferProviderShared does.
This adds a fallback mechanism such that if DrawTargetWebgl allocation fails, a TextureData
is allocated instead that still sends results to RemoteTextureMap. This has the advantage
that we don't need to synchronously block in the content process to verify if acceleration
succeeded, as the costs of such blocking are rather extreme and we must still produce the
rendered frame to ensure the user sees the correct result if the speculative acceleration
failed. It then notifies the content process asynchronously via the refresh mechanism to
try to recreate a non-accelerated buffer provider when it is ready.
There is one additional hitch in RemoteTextureMap: we need to add a mechanism to deal
with the setup of the RemoteTextureOwner. When display list building initially needs to get
the remote texture, the RemoteTextureOwner might not exist yet. In this case, we need to add
a block to ensure we wait for that to occur so that we do not render an erroneous result.
Previously, this block was handled in ClientWebGLContext. Since that is no longer used,
the block must be reinstated somewhere else until a non-blocking mechanism for display list
building to proceed with a stub texture host wrapper can be implemented.
Currently this leaves the gfx.canvas.remote and gfx.canvas.accelerated prefs as separate
toggles rather than trying to lump everything into one mechanism. While this may be desirable
in the future, currently Direct2D remote canvas is a separate acceleration mechanism that
needs to co-exist with the WebGL acceleration, and so being able to toggle both on or off
for testing is desirable.
Differential Revision: https://phabricator.services.mozilla.com/D194352
This mostly restructures DrawTargetWebgl to no longer rely upon ClientWebGLContext.
Instead, it must directly interact with WebGLContext which requires some noisy changes
of the GL rendering API used.
In addition, this restructures SharedContextWebgl so that it can be explicitly
allocated and further DrawTargetWebgls can be allocated that feed off of it.
This is all towards the ultimate goal of relying on remote canvas infrastructure for
remoting instead.
Differential Revision: https://phabricator.services.mozilla.com/D194351
Calling ClientWebGLContext::UniformData() many times causes the
command buffer to fill up and we spend a fair amount of time flushing
the old buffer and allocating a new one, as well as serializing the
values.
The uniforms themselves are very small but they add up over a large
number of calls. We already have some code to track whether the
uniform values are dirty to avoid some redundancy, but a) this doesn't
cover every uniform, and b) we invalidate them all when switching
programs.
This patch makes us track the value of every uniform that gets set
dynamically, and tracks the values separately for each program
used. It then uses these to avoid calling UniformData redundantly.
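The per-program tracking described above can be sketched roughly as below. This is an assumption-laden illustration: UniformCache, its Update method, and the use of plain integers for program and uniform identifiers are all hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical cache that remembers the last value set for each
// (program, uniform location) pair.
class UniformCache {
 public:
  // Returns true if the value differs from the cached one, meaning the caller
  // should actually send it. Returns false if the call would be redundant.
  bool Update(uint32_t aProgram, uint32_t aLocation,
              const std::vector<float>& aValue) {
    const auto key = std::make_pair(aProgram, aLocation);
    auto it = mValues.find(key);
    if (it != mValues.end() && it->second == aValue) {
      return false;
    }
    mValues[key] = aValue;
    return true;
  }

 private:
  // Values are kept per program, so switching programs does not invalidate
  // what is already known about another program's uniforms.
  std::map<std::pair<uint32_t, uint32_t>, std::vector<float>> mValues;
};
```

A caller would only issue the uniform upload when Update returns true, so repeated sets of an unchanged value cost a map lookup instead of a command buffer write.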
Differential Revision: https://phabricator.services.mozilla.com/D190269
For lines and rects, we don't have to worry about AARect generating overlapping
triangles when alpha is used. In these cases we can avoid drawing to a mask first
and avoid a performance cliff.
Differential Revision: https://phabricator.services.mozilla.com/D180525
Since AAStroke can't deal with non-opaque stroked paths, we first generate a normal opaque, anti-aliased
stroked path with AAStroke and render it to a cache texture bound to a render target. We can then later
just use that texture with alpha to support the initial alpha stroke request.
One caveat is that trying to both render to a texture bound to a framebuffer and also upload directly to
it with texSubImage2D can expose bugs in some OpenGL drivers that have different underlying representations
for data textures and for render target textures. To avoid this problem, we segregate the texture cache
pages based on whether they are used as render targets or for direct data uploads.
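One way to picture the page segregation is a cache that never hands out a slot from a page of the wrong kind. The sketch below is hypothetical: TextureCache, Page, PageUse, and the slot counting are invented for illustration and do not mirror the real texture cache.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag for how a cache page is used.
enum class PageUse { DataUpload, RenderTarget };

struct Page {
  PageUse use;
  int freeSlots;
};

// Hypothetical cache that keeps upload pages and render target pages apart.
class TextureCache {
 public:
  // Returns the index of the page that receives the allocation. A page is
  // only reused if it was created for the same kind of use.
  size_t Allocate(PageUse aUse, int aSlotsPerPage = 4) {
    assert(aSlotsPerPage > 0);
    for (size_t i = 0; i < mPages.size(); ++i) {
      if (mPages[i].use == aUse && mPages[i].freeSlots > 0) {
        --mPages[i].freeSlots;
        return i;
      }
    }
    mPages.push_back({aUse, aSlotsPerPage - 1});
    return mPages.size() - 1;
  }

  PageUse UseOf(size_t aIndex) const { return mPages[aIndex].use; }

 private:
  std::vector<Page> mPages;
};
```

Because the usage tag is part of the lookup, a texture that is rendered to is never also the target of a direct upload.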
This ultimately all avoids the fallback of having to draw the alpha stroke in software with Skia and
then upload it to a texture. For stroked paths with large hollow areas, uploading a Skia surface whose
bounds contain the full stroke can cause a lot of uploading of unnecessary pixel data. This allows us
to only upload the triangle mesh for AAStroke and otherwise keep generation solely on the GPU.
Differential Revision: https://phabricator.services.mozilla.com/D180143
This implements some optimizations targeted at Canvas2D's putImageData:
1) Track whether the canvas is in the initially clear state so that we avoid
reading back from the WebGL framebuffer into the Skia framebuffer when a
fallback does occur or when a data snapshot is needed.
2) For surfaces that are too large to upload to a texture, directly use
glTexSubImage2D to draw data to the WebGL framebuffer, bypassing a separate
texture upload.
3) Disregard the surface size limits for SurfacePatterns containing a
compatible texture handle.
Differential Revision: https://phabricator.services.mozilla.com/D171773
If we choose to accelerate a single line path, we need to take care not to use
the line cap when the path is closed. When the path is closed, we need to use
the line join instead.
Differential Revision: https://phabricator.services.mozilla.com/D170469
Skia upstream removed deprecated clip ops that could be used to replace
the clipping stack and bypass clips. We shouldn't really need to do this
anymore, as we can work around it just using public APIs.
The only SkCanvas operation that allows us to bypass clipping is
writePixels, which still allows us to implement CopySurface/putImageData.
Other instances where we were using the replace op for DrawTargetWebgl
layering support can just be worked around by creating a separate
DrawTargetSkia pointing to the same pixel data, but on which no clipping
or transforms are applied so that we can freely do drawing operations
on it to the base layer pixel data regardless of any user-applied clipping.
Differential Revision: https://phabricator.services.mozilla.com/D168039
This updates wpf-gpu-raster to a version that adds support for
GPUs/drivers that use truncation instead of rounding when converting
vertices to fixed point.
It also adds the GL vendor to InitContextResult so that we can detect
AMD on macOS and tell wpf-gpu-raster that truncation is going to happen.
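To see why the distinction matters, the sketch below converts the same coordinate to fixed point with both strategies. The number of fractional bits (8) and the function names are assumptions for illustration; the text does not state what precision any particular GPU uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed number of fractional bits, chosen only for illustration.
constexpr int kFracBits = 8;

// Round to the nearest fixed-point step.
inline int32_t ToFixedRound(float aValue) {
  assert(std::isfinite(aValue));
  return static_cast<int32_t>(std::floor(aValue * (1 << kFracBits) + 0.5f));
}

// Drop the fractional remainder, as a truncating driver would.
inline int32_t ToFixedTrunc(float aValue) {
  assert(std::isfinite(aValue));
  return static_cast<int32_t>(aValue * (1 << kFracBits));
}
```

A coordinate that falls three quarters of the way between two fixed-point steps lands on different steps under the two schemes, so a rasterizer that assumes rounding can be off by one step on hardware that truncates.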
Differential Revision: https://phabricator.services.mozilla.com/D167503
CanvasRenderingContext2D relies upon CreateSimilarDrawTarget to extract
a subrect from a surface to draw. However, DrawTargetWebgl does not return an
accelerated DT for that API as creating an entirely new context can be quite
expensive.
To work around this, this adds a specific ExtractSubrect API for SourceSurface
that can bypass the entire need to create a temporary DrawTarget to copy into.
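The idea of extracting a subrect directly from a surface, with no temporary draw target in between, can be sketched on a plain pixel array. The Surface struct and this ExtractSubrect signature are hypothetical and operate on CPU memory only; the real API works on SourceSurface and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical CPU-side surface with tightly packed 32-bit pixels.
struct Surface {
  int width;
  int height;
  std::vector<uint32_t> pixels;

  // Copies the requested rectangle straight out of this surface. Returns
  // nullopt if the rectangle is empty or not fully inside the surface.
  std::optional<Surface> ExtractSubrect(int aX, int aY, int aW, int aH) const {
    if (aX < 0 || aY < 0 || aW <= 0 || aH <= 0 || aX > width - aW ||
        aY > height - aH) {
      return std::nullopt;
    }
    Surface out{aW, aH, {}};
    out.pixels.reserve(static_cast<size_t>(aW) * static_cast<size_t>(aH));
    for (int row = 0; row < aH; ++row) {
      const std::ptrdiff_t start =
          static_cast<std::ptrdiff_t>(aY + row) * width + aX;
      out.pixels.insert(out.pixels.end(), pixels.begin() + start,
                        pixels.begin() + start + aW);
    }
    return out;
  }
};
```

The caller gets the cropped surface in one step, which is the property the commit relies on to avoid creating a separate draw target just to copy into.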
Differential Revision: https://phabricator.services.mozilla.com/D164118
This pre-allocates a vertex output buffer in DrawTargetWebgl so that we can generate
wpf-gpu-raster and aa-stroke output into it. This way, they don't have to realloc
a Vec for pushes or changing into a boxed slice. This can net 5-10% on profiles for
the demos noted in the bug.
Depends on D163989
Differential Revision: https://phabricator.services.mozilla.com/D163990
It seems like this is slow for now until we implement a better way than wpf-gpu-raster
for stroking paths. Just hide this behind a pref so we can at least test it but not
impact performance as badly.
Differential Revision: https://phabricator.services.mozilla.com/D163248
For use-cases that repeatedly pop and re-push the same clips over and over, we can end up regenerating the
same mask that is still stored, because we only detect that the clip state changed, rather than
that it changed back to exactly the same state it was in previously.
This just remembers the previous state of the clip stack at the time the clip mask was generated
so that we can compare the previous and current state. If they're the same, we can assume there
is no need to regenerate the clip mask again and simply reuse it.
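The comparison of previous and current clip state can be sketched as below. ClipRect, ClipMaskCache, and EnsureMask are hypothetical names, and the clip stack is simplified to a list of rectangles; the real clip stack holds richer state.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical clip entry.
struct ClipRect {
  int x, y, w, h;
  bool operator==(const ClipRect& aOther) const {
    return x == aOther.x && y == aOther.y && w == aOther.w && h == aOther.h;
  }
};

// Hypothetical cache that remembers the clip stack the mask was built from.
class ClipMaskCache {
 public:
  // Returns true if the mask had to be regenerated, or false if the stored
  // mask was reused because the clip stack matches the remembered one.
  bool EnsureMask(const std::vector<ClipRect>& aStack) {
    if (mHasMask && aStack == mLastStack) {
      return false;
    }
    mLastStack = aStack;
    mHasMask = true;
    ++mGenerations;
    return true;
  }

  int Generations() const { return mGenerations; }

 private:
  std::vector<ClipRect> mLastStack;
  bool mHasMask = false;
  int mGenerations = 0;
};
```

Popping and re-pushing an identical set of clips then costs one vector comparison instead of a full mask regeneration.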
Differential Revision: https://phabricator.services.mozilla.com/D162699
WebGL doesn't reliably implement line smoothing, which makes it useless for canvas
lines. Instead, just fall back to emulating it manually with paths.
Differential Revision: https://phabricator.services.mozilla.com/D162540
Some paths may be so complex that their vertex representation far exceeds their
software rasterized representation in memory size. As a sanity-check, we should just set
a hard limit on the maximum allowed complexity of a path that we attempt to supply to
wpf-gpu-raster. Beyond that, we will instead just rasterize in software and upload
to a texture which can be more performant.
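A hard complexity limit of this kind amounts to a simple threshold check before choosing a rasterization route. In the sketch below, the limit value, the use of a segment count as the complexity measure, and all names are assumptions; the text does not say how complexity is measured or what the actual limit is.

```cpp
#include <cassert>
#include <cstddef>

enum class RasterRoute { GpuGeometry, SoftwareUpload };

// Hypothetical limit, chosen only for illustration.
constexpr size_t kMaxPathSegments = 10000;

// Picks GPU geometry generation for paths under the limit, and software
// rasterization followed by a texture upload for anything above it.
inline RasterRoute ChooseRasterRoute(size_t aSegmentCount,
                                     size_t aMaxSegments = kMaxPathSegments) {
  assert(aMaxSegments > 0);
  return aSegmentCount > aMaxSegments ? RasterRoute::SoftwareUpload
                                      : RasterRoute::GpuGeometry;
}
```

Checking the limit up front means an oversized path never reaches the geometry generator at all.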
Differential Revision: https://phabricator.services.mozilla.com/D162481
By default, BorrowSnapshot is pessimistic and forces DrawTargetWebgl to return a data snapshot on
the assumption that the snapshot might be used off thread. However, if we actually know the DrawTarget
we're going to be drawing the snapshot to, then we can check if they're both DrawTargetWebgls with
the same internal SharedContext. In that case, we can use a SourceSurfaceWebgl snapshot which can
pass through a GPU texture to the target. This requires us to plumb the DrawTarget down through
SurfaceFromElement all the way to DrawTargetWebgl to make this decision.
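The decision described above reduces to checking whether source and destination share a context. The sketch below uses hypothetical stand-ins (SharedContextStub, TargetStub, ChooseSnapshot); a null context pointer models a draw target that is not WebGL-backed, and a null destination models the case where the target is not known.

```cpp
#include <cassert>

// Hypothetical stand-in for the shared context object.
struct SharedContextStub {};

// Hypothetical draw target. A null context means it is not WebGL-backed.
struct TargetStub {
  const SharedContextStub* context;
};

enum class SnapshotKind { Data, GpuTexture };

// Stays pessimistic unless the destination is known and shares the source's
// context, in which case a GPU texture can be passed through directly.
inline SnapshotKind ChooseSnapshot(const TargetStub& aSource,
                                   const TargetStub* aDestination) {
  if (aDestination && aSource.context &&
      aDestination->context == aSource.context) {
    return SnapshotKind::GpuTexture;
  }
  return SnapshotKind::Data;
}
```

Any doubt about where the snapshot will be used keeps the data snapshot route, which matches the pessimistic default.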
Differential Revision: https://phabricator.services.mozilla.com/D162176
This adds a path vertex buffer where triangle list output from WGR is stored.
Each PathCacheEntry can potentially reference a range of vertexes in this buffer
corresponding to triangles for that entry. When this buffer is full, it gets
orphaned and clears corresponding cache entries, so that it can start anew.
Differential Revision: https://phabricator.services.mozilla.com/D161479
For canvas users that rapidly create and destroy canvases, we may end up creating
a new SharedContext (and hence ClientWebGLContext) if there are no more canvases
left between destruction and creation. To work around this, just keep alive the
SharedContext for the main thread (other threads are unfortunately a bit tricky
to support) so that canvas creation remains fast in this instance.
Depends on D158903
Differential Revision: https://phabricator.services.mozilla.com/D158904
Previously we were reusing the framebuffer's Skia DT to render the clip mask.
This was the path of least resistance since SkCanvas does not allow exporting
clip information, and there is no way to reset the bitmap storage inside an
SkCanvas temporarily.
However, this can cause a feedback cycle of unnecessary WaitForShmem operations,
since we need to wait before we can generate the clip mask into the Skia target,
and then anything else after it needs to wait for the clip mask to finish uploading
before the Skia DT can be used again.
To alleviate this, we just allocate a new DrawTargetSkia to render the clip mask
into. We carefully clip the size of the DT so that in the common case we avoid
having to upload a surface the size of the entire framebuffer. Further, since
this is a completely different DT, we can now use an A8 format (1/4 the memory
overhead) instead of a BGRA8 format for the clip mask, which gives a further
memory usage gain.
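The memory saving from the two changes above (a clipped mask size and an A8 format) is simple arithmetic. The helper below and the example dimensions used with it are illustrative assumptions, not measurements from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per pixel for the two formats mentioned in the text.
constexpr size_t kBytesPerPixelA8 = 1;
constexpr size_t kBytesPerPixelBGRA8 = 4;

// Size in bytes of a tightly packed surface.
inline size_t SurfaceBytes(int aWidth, int aHeight, size_t aBytesPerPixel) {
  assert(aWidth >= 0 && aHeight >= 0);
  return static_cast<size_t>(aWidth) * static_cast<size_t>(aHeight) *
         aBytesPerPixel;
}
```

For example, a full 1024x768 BGRA8 mask would take 3,145,728 bytes, the same area in A8 takes 786,432 bytes, and an A8 mask clipped to a 256x256 region takes only 65,536 bytes.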
A further complication is that we need to log the current clip stack state so
that we can replay it onto the new DrawTargetSkia. This avoids having to add
a mechanism to SkCanvas to export clip information.
Differential Revision: https://phabricator.services.mozilla.com/D157050
Sometimes the clip state is thrashed when we need to temporarily override
clipping to disable it. However, in this case, the clip mask itself remains
unchanged. The current invalidation scheme doesn't discern between generation
of the clip mask itself and setting the clip state for the shader, leading to
unnecessary regeneration of the clip mask.
This code just tries to discern when this is happening so we can refresh the
clip state without having to regenerate the clip mask unless truly necessary.
Differential Revision: https://phabricator.services.mozilla.com/D157048